Preface


These are the proceedings of the Sixth International Conference on Logic
Programming and Nonmonotonic Reasoning (LPNMR 2001). The conference was
held in Vienna from 17th to 19th of September, 2001. It was collocated with the
Joint German/Austrian Conference on Artificial Intelligence (24th German/9th
Austrian Conference on Artificial Intelligence), KI2001.

LPNMR conferences aim to promote research in logic-based programming
languages, database systems, nonmonotonic reasoning, and knowledge
representation. LPNMR 2001 was the sixth conference in the series. The previous
meetings were held in Washington, DC, in 1991, in Lisbon, Portugal, in 1993, in
Lexington, Kentucky, in 1995, in Dagstuhl, Germany, in 1997, and in El Paso,
Texas, in 1999.

The technical program of LPNMR 2001 comprised five invited talks
given by Jürgen Dix, Georg Gottlob, Phokion Kolaitis, Maurizio Lenzerini,
and Chiaki Sakama. It also contained 23 technical presentations selected
by the program committee during a rigorous review process. Finally, as a part
of the technical program, the conference featured a special session comprising
nine presentations and demonstrations of implemented nonmonotonic reasoning
systems. All these contributions are included in the proceedings.

Many  individuals  worked  for  the  success  of  the  conference.  Special  thanks 
are  due  to  all  members  of  the  program  committee  and  to  additional  reviewers 
for  their  efforts  to  produce  fair  and  thorough  evaluations  of  submitted  papers. 
Furthermore, we would like to thank the members of the Knowledge Based
Systems Group of the Vienna University of Technology, who took care of the
local organization. We particularly appreciated the untiring effort of Elfriede
Nedoma, secretary to the group. We would also like to thank Gerd Brewka for
his  supportive  role  in  arranging  the  collocation  of  the  conference  with  KI2001. 
Last,  but  not  least,  we  thank  the  sponsoring  institutions  for  their  generosity. 

Wolfgang Faber


A Computational Logic Approach to
Heterogenous Agent Systems


Jürgen Dix *

The University of Manchester, Dept. of CS
Oxford Road, Manchester M13 9PL, UK
dix@cs.man.ac.uk
http://www.cs.man.ac.uk/~jdix


Abstract. I report on a particular approach to heterogenous agent
systems, IMPACT, which is strongly related to computational logic. The
underlying methods and techniques stem from both non-monotonic
reasoning and logic programming. I present three recent extensions to
illustrate the generality and usefulness of the approach: (1) incorporating
planning, (2) uncertain (probabilistic) reasoning, and (3) reducing the
load of serving multiple requests. While (1) illustrates how easy it is to
incorporate hierarchical task networks into IMPACT, (2) makes heavy
use of annotated logic programming and (3) is strongly related to
classical first-order reasoning. This paper is a high-level description of (1)-(3).
More detailed expositions can be found in [1,2,3,4], from which most parts
of this paper are taken.


1  The  Basic  Framework 

The IMPACT project (http://www.cs.umd.edu/projects/impact) aims at
developing a powerful multi-agent system, which (1) is able to deal with
heterogenous and distributed data, (2) can be realized on top of arbitrary legacy
code, yet (3) is built on a clear foundational basis and (4) scales up for
realistic applications.

In this article I point to some recent extensions of the basic framework
(which has been implemented and is running) that show very clearly the
strong links to computational logic, even though IMPACT’s implementation is
not realized on top of a logic-related procedural mechanism.

To get a bird’s eye view of IMPACT, here are the most important features:

- Each IMPACT agent has certain actions available. Agents act in their
environment according to their agent program and a well-defined semantics
determining which of the actions the agent should execute.

-  Each  agent  continually  undergoes  the  following  cycle: 

* The work I am reporting has been done with many colleagues, notably Th. Eiter,
S. Kraus, H. Muñoz-Avila, M. Nanni, D. Nau, F. Özcan, T.J. Rogers, R. Ross
and, last but not least, V.S. Subrahmanian. It resulted in a variety of papers and I
gratefully acknowledge their support.

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  1-20,  2001. 

(c) Springer-Verlag Berlin Heidelberg 2001


Fig. 1. SHOP as a planning agent in IMPACT (diagram of the IMPACT architecture)


    (1) Get messages from other agents. This changes the state of the agent.

    (2) Determine (based on its program, its semantics and its state) for each
    action its status (permitted, obliged, forbidden, ...). The agent ends up
    with a set of status atoms.

    (3) Based on a notion of concurrency, determine the actions that can be
    executed and update the state accordingly.

- IMPACT agents are built on top of arbitrary software code (legacy data).

- A methodology for transforming arbitrary software (legacy code) into an
agent has been developed.

A complete description of all these notions is beyond the scope of this paper;
we refer to [3] for a detailed presentation.

Before explaining an agent in more detail, we need to make some comments
about the general architecture. In IMPACT, agents communicate with other
agents through the network. Not only can they send messages to (and receive
messages from) other agents, they can also ask the server to find out about
services that other agents offer. For example, a planning agent (let us call it
A-SHOP), confronted with a particular planning problem, can find out if there
are agents out there with the data needed to solve the planning problem; or
agents can provide A-SHOP with information about relevant legacy data.

One  of  the  main  features  of  IMPACT  is  to  provide  a  method  (see  [3])  for 
agentizing  arbitrary  legacy  code,  i.e.  to  turn  such  legacy  code  into  an  agent.  In 
order  to  do  this,  we  need  to  abstract  from  the  given  code  and  describe  its  main 
features.  Such  an  abstraction  is  given  by  the  set  of  all  datatypes  and  functions 
the software is managing. We call this a body of software code and denote it by
$\mathcal{S} = (\mathcal{T}_{\mathcal{S}}, \mathcal{F}_{\mathcal{S}})$. $\mathcal{T}_{\mathcal{S}}$ is the set of datatypes, and $\mathcal{F}_{\mathcal{S}}$ is a
set of predefined functions which make access to the data objects managed by
the agent available to external processes.

For example, in many applications a statistics agent is needed. This agent
keeps track of distances between two given points and the authorized range or
capacity of certain vehicles. This information can be stored in several databases.
Another example is the supplier agent. It determines through its databases
which vehicles are accessible at a given location.

Definition 1 (State of an Agent, $\mathcal{O}_{\mathcal{S}}(t)$). At any given point $t$ in time, the
state of an agent, denoted $\mathcal{O}_{\mathcal{S}}(t)$, is the set of all data objects that are currently
stored in the relations the agent handles; the types of these objects must be in
the base set of types in $\mathcal{T}_{\mathcal{S}}$.

In  the  examples  just  mentioned,  the  state  of  the  statistics  agent  consists  of  all 
tuples  stored  in  the  databases  it  handles.  The  state  of  the  supplier  agent  is  the 
set  of  all  tuples  describing  which  vehicles  are  accessible  at  a  given  location. 

We noted that agents can send and receive messages. There is therefore a
special data structure, the message box, that is part of each agent. This message
box is just one of those types. Thus a state change occurs already when a
message is received.


1.1  The  Code  Call  Machinery 

To  perform  logical  reasoning  on  top  of  third  party  data  structures  (which  are 
part  of  the  agent’s  state)  and  code,  the  agent  must  have  a  language  within 
which  it  can  reason  about  the  agent  state.  We  therefore  introduce  the  concept 
of  a  code  call  atom,  which  is  the  basic  syntactic  object  used  to  access  multiple 
heterogeneous  data  sources. 

Definition 2 (Code Calls (cc)). Suppose $\mathcal{S} =_{def} (\mathcal{T}_{\mathcal{S}}, \mathcal{F}_{\mathcal{S}})$ is some software
code, $f \in \mathcal{F}_{\mathcal{S}}$ is a predefined function with $n$ arguments, and $d_1, \ldots, d_n$ are objects
or variables such that each $d_i$ respects the type requirements of the $i$'th argument
of $f$. Then, $\mathcal{S}:f(d_1, \ldots, d_n)$ is a code call. A code call is ground if all the $d_i$'s
are objects.

We often identify software code $\mathcal{S}$ with the agent that is built on top of it.
This is because an agent really is uniquely determined by it.

A code call executes an API function and returns as output a set of objects
of the appropriate output type. Going back to our two agents introduced above,
statistics may be able to execute the cc statistics:distance(locFrom, locTo).
The supplier agent may execute the following cc:
supplier:cargoPlane(locFrom).

What  we  really  need  to  know  is  if  the  result  of  evaluating  such  code  calls 
is  contained  in  a  certain  set  or  not.  To  do  this,  we  introduce  code  call  atoms. 
These  are  logical  atoms  that  are  layered  on  top  of  code  calls.  They  are  defined 
through  the  following  inductive  definition. 

Definition 3 (Code Call Atoms (in(X, cc))). If cc is a code call, and X is
either a variable symbol, or an object of the output type of cc, then in(X, cc) and
not_in(X, cc) are code call atoms. not_in(X, cc) succeeds if X is not in the set
of objects returned by the code call cc.

Code  call  atoms,  when  evaluated,  return  boolean  values,  and  thus  may  be  thought 
of  as  special  types  of  logical  atoms.  Intuitively,  a  code  call  atom  of  the  form 
in(X,  cc)  succeeds  if  X  can  be  set  to  a  pointer  to  one  of  the  objects  in  the  set  of 
objects  returned  by  executing  the  code  call. 

As an example, the code call atom

in(f22, supplier:cargoPlane(collegepark)) tells us that the particular plane
“f22” is available as a cargo plane in College Park.
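
To make the layering of atoms on top of code calls concrete, here is a minimal Python sketch. The registry `API`, the helper names `code_call`, `in_atom` and `not_in_atom`, and the second plane `c130` are illustrative assumptions and not part of IMPACT; only the call supplier:cargoPlane and the object f22 come from the text.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical registry: (agent, function) -> Python callable returning a set
# of objects. It stands in for the predefined functions of a body of software code.
API: Dict[Tuple[str, str], Callable[..., Set[str]]] = {
    # Toy data: the planes listed for "collegepark" are invented for this sketch.
    ("supplier", "cargoPlane"):
        lambda loc: {"f22", "c130"} if loc == "collegepark" else set(),
}


def code_call(agent: str, func: str, *args: str) -> Set[str]:
    """Execute the ground code call agent:func(args) and return its result set."""
    return API[(agent, func)](*args)


def in_atom(obj: str, agent: str, func: str, *args: str) -> bool:
    """in(obj, agent:func(args)) succeeds if obj is in the returned set."""
    return obj in code_call(agent, func, *args)


def not_in_atom(obj: str, agent: str, func: str, *args: str) -> bool:
    """not_in(obj, agent:func(args)) succeeds if obj is not in the returned set."""
    return obj not in code_call(agent, func, *args)


print(in_atom("f22", "supplier", "cargoPlane", "collegepark"))
```

The sketch only covers ground atoms; binding a variable X to each returned object is a separate step that a real evaluator would need.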

Often,  the  results  of  evaluating  code  calls  give  us  back  certain  values  that 
we  can  compare.  Based  on  such  comparisons,  certain  actions  might  be  fired 
or  not.  To  this  end,  we  need  to  define  code  call  conditions.  Intuitively,  a  code 
call  condition  is  a  conjunction  of  code  call  atoms,  equalities,  and  inequalities. 
Equalities and inequalities can be seen as additional syntax that “links” together
variables occurring in the atomic code calls.

Definition 4 (Code Call Conditions (ccc)).

1. Every code call atom is a code call condition.

2. If $s$, $t$ are either variables or objects, then $s = t$ is a code call condition.

3. If $s$, $t$ are either integer/real valued objects, or are variables over the
integers/reals, then $s < t$, $s > t$, $s \geq t$, $s \leq t$ are code call conditions.

4. If $\chi_1$, $\chi_2$ are code call conditions, then $\chi_1 \,\&\, \chi_2$ is a code call condition.

A  code  call  condition  satisfying  any  of  the  first  three  criteria  above  is  an  atomic 
code  call  condition. 
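
A code call condition can be read as a conjunction that is evaluated from left to right while variable bindings accumulate. The following Python sketch illustrates that reading under stated assumptions: the class `Var`, the tuple encoding of atoms, the function `solve`, and the toy `distance` function returning 42 are all hypothetical and are not IMPACT's actual evaluator.

```python
import operator
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple


class Var:
    """A variable symbol; any term that is not a Var is treated as an object."""

    def __init__(self, name: str) -> None:
        self.name = name


Subst = Dict[str, Any]

COMPARE: Dict[str, Callable[[Any, Any], bool]] = {
    "=": operator.eq, "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def value(term: Any, subst: Subst) -> Any:
    """Look up a term; fail loudly if a variable is still uninstantiated."""
    if isinstance(term, Var):
        if term.name not in subst:
            raise ValueError(f"variable {term.name} is not instantiated")
        return subst[term.name]
    return term


def solve(ccc: Sequence[Tuple], subst: Optional[Subst] = None) -> List[Subst]:
    """Return all substitutions that satisfy the conjunction, left to right.

    Atoms are encoded as ("in", X, func, args) or (op, s, t).
    """
    substs: List[Subst] = [dict(subst or {})]
    for atom in ccc:
        extended: List[Subst] = []
        for s in substs:
            if atom[0] == "in":
                _, x, func, args = atom
                results = func(*(value(a, s) for a in args))
                if isinstance(x, Var) and x.name not in s:
                    # Unbound variable: one new substitution per returned object.
                    extended.extend({**s, x.name: r} for r in results)
                elif value(x, s) in results:
                    extended.append(s)
            else:
                op, left, right = atom
                if COMPARE[op](value(left, s), value(right, s)):
                    extended.append(s)
        substs = extended
    return substs


def distance(loc_from: str, loc_to: str) -> set:
    """Toy stand-in for statistics:distance; the value 42 is invented."""
    return {42}


D = Var("D")
ccc = [("in", D, distance, ("collegepark", "baltimore")), ("<", D, 100)]
print(solve(ccc))
```

Running `solve` on a condition whose code call arguments contain an unbound variable raises `ValueError`, which mirrors the point that such a condition cannot be evaluated.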

1.2 Agent Programs and Semantics

We are now coming to the very heart of the definition of an agent: its agent
program. Such a program consists of rules of the form:

$\mathrm{Op}\,\alpha \leftarrow \mathrm{Op}_1\beta_1, \ldots, \mathrm{Op}_n\beta_n, ccc_1, \ldots, ccc_r,$

where $\alpha, \beta_1, \ldots, \beta_n$ are actions (the agent can execute), $\mathrm{Op}, \mathrm{Op}_1, \ldots, \mathrm{Op}_n$ describe
the status of the action (obliged, forbidden, waived, doable) and the $ccc_i$ are code call
conditions to be evaluated in the actual state.

Thus, the $\mathrm{Op}_i$ are operators that take actions as arguments. They describe the
status of the arguments they take. Here are some examples of actions: (1) to load
some cargo from a certain location, (2) to fly a plane from a certain location to
another location, (3) to unload some cargo from a certain location. The action
status atom $\mathbf{F}\,load$ (resp. $\mathbf{Do}\,fly$) means that the action load is forbidden (resp.
fly should be done). Actions themselves are terms; only with an operator in front
of them do they become atoms.

In IMPACT, actions are very much like STRIPS operators: they have
preconditions and add- and delete-lists (see appendix). The difference to STRIPS is
that these preconditions and lists consist of arbitrary code call conditions, not
just of logical atoms.

Figure 2 illustrates that the agent program together with the chosen
semantics SEM and the state of the agent determines the set of all status atoms.
However, the doable actions among them might be conflicting and therefore we
have to use the chosen concurrency notion to finally determine which actions
can be concurrently executed. The agent then executes these actions and changes
its state.
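
As a rough illustration of how a program and a state together yield status atoms, the Python sketch below computes a least fixpoint over ground rules and then collects the actions marked Do. This is a deliberately naive reading and an assumption of this sketch: it is not one of the semantics SEM of [3], it does not model any concurrency notion, and the names `Rule`, `status_atoms`, `to_execute` as well as the toy state are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Set, Tuple

StatusAtom = Tuple[str, str]          # (operator, action), e.g. ("Do", "load")
State = Dict[str, object]


class Rule(NamedTuple):
    head: StatusAtom                      # status atom derived by the rule
    body: FrozenSet[StatusAtom]           # status atoms required in the body
    condition: Callable[[State], bool]    # stands in for the code call conditions


def status_atoms(rules: List[Rule], state: State) -> Set[StatusAtom]:
    """Derive status atoms by applying the rules until nothing new is added."""
    derived: Set[StatusAtom] = set()
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if (rule.head not in derived
                    and rule.body <= derived
                    and rule.condition(state)):
                derived.add(rule.head)
                changed = True
    return derived


def to_execute(status: Set[StatusAtom]) -> Set[str]:
    """Actions whose status is Do; conflicts between them are not handled here."""
    return {action for op, action in status if op == "Do"}


# Toy state and rules, invented for this sketch.
state: State = {"cargo_at": "bwi", "plane_at": "bwi", "fuel": 5}
rules = [
    Rule(("O", "load"), frozenset(), lambda s: s["cargo_at"] == s["plane_at"]),
    Rule(("Do", "load"), frozenset({("O", "load")}), lambda s: True),
    Rule(("F", "fly"), frozenset(), lambda s: s["fuel"] < 10),
]
status = status_atoms(rules, state)
print(sorted(status), to_execute(status))
```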


1.3  Evaluability  of  ccc’s 

Code call conditions provide a simple, but powerful language syntax to access
heterogeneous data structures and legacy software code. However, in general
their use in agent programs is not restricted. In particular, it is possible that a ccc
cannot be evaluated (and thus the status of actions cannot be determined)
simply because there are uninstantiated variables and thus the underlying functions
cannot be executed. Here is a simple example.

Example 1 (Sample ccc). The code call condition

    in(FinanceRec, rel:select(financeRel, date, "=", "11/15/99")) &
    FinanceRec.sales > 10K &
    in(C, excel:chart(excelFile, FinanceRec, day)) &
    in(Slide, ppt:include(C, "presentation.ppt"))

is a complex condition that accesses and merges data across a relational database,
an Excel file, and a PowerPoint file. It first selects all financial records associated
with "11/15/99": this is done with the variable FinanceRec in the first line. It
then filters out those records having sales more than 10K (second line). Using
the remaining records, an Excel chart is created with day of sale on the x-axis
(third line) and the resulting chart is included in the PowerPoint file
"presentation.ppt" (fourth line).

Fig. 3. A code call evaluation graph (cceg)

In the above example, it is very important that the first code call be evaluable. If
financeRel were a variable, then rel:select(FinanceRel, date, "=", "11/15/99")
would not be evaluable, unless there were another condition instantiating this
variable.

We have introduced syntactic conditions, similar to safety in classical
databases, to ensure evaluability of ccc’s. It is also quite easy to store ccc’s as
evaluation graphs (see Figure 3), thereby making explicit the dependency relation
between their constituents (see [4]). It is, however, still perfectly possible that the
execution of a code call does not terminate and we have to add another condition
to ensure termination (see Subsection 2.3).
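
The idea behind such a syntactic check can be sketched in Python as follows: each conjunct is abstracted to the variable it binds and the variables it needs, and the check looks for an order in which every conjunct only uses variables bound earlier. This is a simplified assumption for illustration; the precise safety definition is the one in [3], and the function `safe_order` and the tuple encoding are hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple

# (variable bound by the atom or None, variables that must already be bound)
Atom = Tuple[Optional[str], Set[str]]


def safe_order(atoms: Sequence[Atom]) -> Optional[List[int]]:
    """Return an evaluation order (indices into atoms), or None if none exists."""
    bound: Set[str] = set()
    order: List[int] = []
    pending = list(range(len(atoms)))
    while pending:
        # Pick the first pending atom whose input variables are all bound.
        ready = next((i for i in pending if atoms[i][1] <= bound), None)
        if ready is None:
            return None          # some variable can never be instantiated
        output = atoms[ready][0]
        if output is not None:
            bound.add(output)
        order.append(ready)
        pending.remove(ready)
    return order


# The four conjuncts of Example 1, abstracted to their variables.
example1: List[Atom] = [
    ("FinanceRec", set()),         # rel:select(...) with constant arguments
    (None, {"FinanceRec"}),        # FinanceRec.sales > 10K
    ("C", {"FinanceRec"}),         # excel:chart(excelFile, FinanceRec, day)
    ("Slide", {"C"}),              # ppt:include(C, "presentation.ppt")
]
# Variant in which financeRel is an uninstantiated variable FinanceRel.
variant: List[Atom] = [("FinanceRec", {"FinanceRel"})] + example1[1:]

print(safe_order(example1), safe_order(variant))
```

The dependencies recorded in the second component of each tuple are the kind of information an evaluation graph makes explicit.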

2  Planning 

In this section we show how an HTN planning system, SHOP, can be integrated
into the IMPACT multi-agent environment. We define the A-SHOP algorithm, an
agentized adaptation of the original SHOP planning algorithm ([5]) that takes
advantage of IMPACT’s capabilities for interacting with external agents,
performing mixed symbolic/numeric computations, and making queries to distributed,
heterogeneous information sources (such as arbitrary legacy and/or specialized
data structures or external databases). We also show that A-SHOP is both sound
and complete if certain conditions (related to evaluability and termination of the
underlying code calls) are met.


2.1  HTN  Planning 

Rather  than  giving  a  detailed  description  of  the  kind  of  HTN  planning  used  by 
SHOP  ([5]),  we  consider  the  following  example  taken  from  [2]. 

In  order  to  do  planning  in  a  given  planning  domain,  SHOP  needs  to  be  given 
knowledge about that domain. SHOP’s knowledge base contains operators and
methods.  Each  operator  is  a  description  of  what  needs  to  be  done  to  accomplish 
some  primitive  task,  and  each  method  is  a  prescription  for  how  to  decompose 
some  complex  task  into  a  totally  ordered  sequence  of  subtasks,  along  with  various 
restrictions  that  must  be  satisfied  in  order  for  the  method  to  be  applicable. 

Given the next task to accomplish, SHOP chooses an applicable method,
instantiates it to decompose the task into subtasks, and then chooses and
instantiates other methods to decompose the subtasks even further. If the constraints
on the subtasks prevent the plan from being feasible, SHOP will backtrack and
try other methods.

As an example, Figure 4 shows two methods for the task of travelling from
one location to another: travelling by air, and travelling by taxi. Travelling by air
involves the subtasks of purchasing a plane ticket, travelling to the local airport,
flying to an airport close to our destination, and travelling from there to our
destination. Travelling by taxi involves the subtasks of calling a taxi, riding in
it to the final destination, and paying the driver.

Note  that  each  method’s  preconditions  are  not  used  to  create  subgoals  (as 
would  be  done  in  action-based  planning).  Rather,  they  are  used  to  determine 
whether  or  not  the  method  is  applicable:  thus  in  Figure  4,  the  travel  by  air 
method  is  only  applicable  for  long  distances,  and  the  travel  by  taxi  method  is 
only  applicable  for  short  distances. 

Here  are  some  of  the  complications  that  can  arise  during  the  planning  process: 

- The planner may need to recognize and resolve interactions among the
subtasks. For example, in planning how to travel to the airport, one needs to
make sure one will arrive at the airport in time to catch the plane. To make
the example in Figure 4 more realistic, such information would need to be
specified as part of SHOP’s methods and operators.


travel(UMD, MIT)
    buy ticket(BWI, Logan)
    travel(UMD, BWI)
        get taxi
        ride taxi(UMD, BWI)
        pay driver
    fly(BWI, Logan)
    travel(Logan, MIT)
        get taxi
        ride taxi(Logan, MIT)

Fig. 4. Travel planning example

- In the example in Figure 4, it was always obvious which method to use. But
in general, more than one method may be applicable to a task. If it is not
possible to solve the subtasks produced by one method, SHOP will backtrack
and try another method instead.
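
The decomposition and backtracking described above can be sketched as a small totally ordered HTN planner in Python. This is an illustrative assumption and not SHOP's implementation: the function `seek_plan`, the encoding of methods as Python functions that return a subtask list or None, the distances, and the 100-unit threshold separating "short" from "long" trips are all invented for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Task = Tuple[str, ...]
State = Dict[str, dict]


def seek_plan(tasks: List[Task], state: State,
              methods: Dict[str, List[Callable]],
              operators: Dict[str, Callable]) -> Optional[List[Task]]:
    """Decompose tasks left to right; return a list of primitive tasks or None."""
    if not tasks:
        return []
    task, rest = tasks[0], tasks[1:]
    name, args = task[0], task[1:]
    if name in operators:                       # primitive task
        new_state = operators[name](state, *args)
        if new_state is None:
            return None
        tail = seek_plan(rest, new_state, methods, operators)
        return None if tail is None else [task] + tail
    for method in methods.get(name, []):        # compound task
        subtasks = method(state, *args)
        if subtasks is None:                    # preconditions not satisfied
            continue
        plan = seek_plan(subtasks + rest, state, methods, operators)
        if plan is not None:
            return plan                         # otherwise backtrack
    return None


def travel_by_air(state: State, x: str, y: str) -> Optional[List[Task]]:
    d = state["dist"].get((x, y))
    if d is None or d <= 100:                   # only for long distances
        return None
    ax, ay = state["airport"][x], state["airport"][y]
    return [("buy_ticket", ax, ay), ("travel", x, ax),
            ("fly", ax, ay), ("travel", ay, y)]


def travel_by_taxi(state: State, x: str, y: str) -> Optional[List[Task]]:
    d = state["dist"].get((x, y))
    if d is None or d > 100:                    # only for short distances
        return None
    return [("get_taxi", x), ("ride_taxi", x, y), ("pay_driver",)]


# Toy operators that leave the state unchanged; enough to show decomposition.
operators = {name: (lambda state, *args: state) for name in
             ["buy_ticket", "fly", "get_taxi", "ride_taxi", "pay_driver"]}
methods = {"travel": [travel_by_air, travel_by_taxi]}
state: State = {
    "dist": {("UMD", "MIT"): 400, ("UMD", "BWI"): 20, ("Logan", "MIT"): 5},
    "airport": {"UMD": "BWI", "MIT": "Logan"},
}
plan = seek_plan([("travel", "UMD", "MIT")], state, methods, operators)
print(plan)
```

Note that the preconditions inside each method only decide applicability; they never generate subgoals, which matches the remark on method preconditions above.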


2.2  Agentization  of  SHOP 

A comparison between IMPACT’s actions and SHOP’s methods shows that
IMPACT actions correspond to fully instantiated methods, i.e. no subtasks. While
SHOP’s methods and operators are based on STRIPS, the first step is to modify
the atoms in SHOP’s preconditions and effects, so that SHOP’s preconditions
will be evaluated by IMPACT’s code call mechanism and the effects will change
the state of the IMPACT agents. This is a fundamental change in the
representation of SHOP. In particular, it requires replacing SHOP’s methods and operators
with agentized methods and operators. These are defined as follows.

Definition 5 (Agentized Method: (AgentMeth $h$ $\chi$ $\mathbf{t}$)). An agentized
method is an expression of the form (AgentMeth $h$ $\chi$ $\mathbf{t}$) where $h$ (the method’s
head) is a compound task, $\chi$ (the method’s preconditions) is a code call condition
and $\mathbf{t}$ is a totally ordered list of subtasks, called the task list.

The primary difference between the definition of an agentized method and the
definition of a method in SHOP is as follows. In SHOP, preconditions were logical
atoms, and SHOP would infer these preconditions from its current state of the
world using Horn-clause inference. In contrast, the preconditions in an agentized
method are IMPACT’s code call conditions rather than logical atoms, and
A-SHOP (the agentized version of SHOP defined in the next section) does not
use Horn-clause inference to establish these preconditions but instead simply
invokes those code calls, which are calls to other agents (which may be
Horn-clause theorem provers or may instead be something entirely different).

Definition 6 (Agentized Operator: (AgentOp $h$ $\chi_{add}$ $\chi_{del}$)). An agentized
operator is an expression of the form (AgentOp $h$ $\chi_{add}$ $\chi_{del}$), where $h$ (the
head) is a primitive task and $\chi_{add}$ and $\chi_{del}$ are lists of code calls (called the
add- and delete-lists). The set of variables in the tasks in $\chi_{add}$ and $\chi_{del}$ is a
subset of the set of variables in $h$.


The  Algorithm 

The A-SHOP algorithm is now an easy adaptation of the original SHOP
algorithm. Unlike SHOP (which would apply an operator by directly inserting and
deleting atoms from an internally-maintained state of the world), A-SHOP needs
to reason about how the code calls in an operator will affect the states of other
agents. One might think the simplest way to do this would be simply to tell these
agents to execute the code calls and then observe the results, but this would not
work correctly. Once the planning process has ended successfully, A-SHOP will
return a plan whose operators can be applied to modify the states of the other
IMPACT agents, but A-SHOP should not change the states of those agents during
its planning process because this would prevent A-SHOP from backtracking
and trying other operators.

procedure A-SHOP($\mathbf{t}$, $\mathcal{D}$)
 1. if $\mathbf{t}$ = nil then return nil
 2. $t$ := the first task in $\mathbf{t}$; $R$ := the remaining tasks
 3. if $t$ is primitive and a simple plan for $t$ exists then
 4.     $q$ := simplePlan($t$)
 5.     return concatenate($q$, A-SHOP($R$, $\mathcal{D}$))
 6. else if $t$ is non-primitive and there is a reduction of $t$ then
 7.     nondeterministically choose a reduction:
            nondeterministically choose an agentized method
            (AgentMeth $h$ $\chi$ $\mathbf{t}$), with $\mu$ the most general
            unifier of $h$ and $t$ and a substitution $\theta$ s.t.
            $\chi\mu\theta$ is ground and holds in IMPACT’s state $\mathcal{O}$
 8.     return A-SHOP(concatenate($\mathbf{t}\mu\theta$, $R$), $\mathcal{D}$)
 9. else return FAIL
10. end if
end A-SHOP

procedure simplePlan($t$)
11. nondeterministically choose agent, operator
    $op$ = (AgentOp $h$ $\chi_{add}$ $\chi_{del}$) with $\nu$ the most
    general unifier of $h$ and $t$ s.t. $h\nu$ is ground
12. monitoring:apply($op\,\nu$)
13. return $op\,\nu$
end simplePlan

Fig. 5. A-SHOP, the agentized version of SHOP

Thus in Step 12, A-SHOP does not issue code calls to the other agents directly,
but instead communicates them to a monitoring agent. The monitoring agent
keeps track of all operators that are supposed to be applied, without actually
modifying the states of the other IMPACT agents. When A-SHOP queries for a
code call $cc = \mathcal{S}:f(d_1, \ldots, d_n)$ in $\chi$ to evaluate a method’s precondition
(Step 7), the monitoring agent examines whether cc has been affected by the
intended modifications of the operators and, if so, it evaluates cc itself. If cc is
not affected by the application of operators, IMPACT evaluates cc (i.e., by
accessing $\mathcal{S}$). The list of operators maintained by the monitoring agent is reset
every time a planning process begins. The apply function applies the operators
and creates copies of the state of the world. Depending on the underlying
software code, these changes might be easily revertible or not. In the latter case,
the monitoring agent has to keep track of the old state of the world.
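
One way to picture the monitoring agent is as an overlay of intended additions and deletions on top of a state that is never touched during planning. The Python sketch below follows that picture under simplifying assumptions: ground facts stand in for the code calls of the add- and delete-lists, and the class `MonitoringAgent` with its methods `apply`, `holds`, `snapshot`, `restore` and `reset` is hypothetical rather than IMPACT's actual interface.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Fact = Tuple[str, ...]
Snapshot = Tuple[FrozenSet[Fact], FrozenSet[Fact]]


class MonitoringAgent:
    """Records intended modifications without changing the underlying state."""

    def __init__(self, base_state: FrozenSet[Fact]) -> None:
        self._base = base_state            # real state, never mutated here
        self._added: Set[Fact] = set()
        self._deleted: Set[Fact] = set()

    def reset(self) -> None:
        """Forget all intended modifications (start of a planning process)."""
        self._added.clear()
        self._deleted.clear()

    def apply(self, add_list: Iterable[Fact], delete_list: Iterable[Fact]) -> None:
        """Record the effects of an operator; deletions first, then additions."""
        for fact in delete_list:
            self._added.discard(fact)
            self._deleted.add(fact)
        for fact in add_list:
            self._deleted.discard(fact)
            self._added.add(fact)

    def holds(self, fact: Fact) -> bool:
        """Answer from the overlay if the fact was affected, else from the base."""
        if fact in self._added:
            return True
        if fact in self._deleted:
            return False
        return fact in self._base

    def snapshot(self) -> Snapshot:
        return frozenset(self._added), frozenset(self._deleted)

    def restore(self, snap: Snapshot) -> None:
        """Undo everything recorded after the snapshot (used when backtracking)."""
        self._added, self._deleted = set(snap[0]), set(snap[1])


base = frozenset({("at", "plane", "BWI")})
monitor = MonitoringAgent(base)
mark = monitor.snapshot()
monitor.apply(add_list=[("at", "plane", "Logan")],
              delete_list=[("at", "plane", "BWI")])
after_apply = (monitor.holds(("at", "plane", "Logan")),
               monitor.holds(("at", "plane", "BWI")))
monitor.restore(mark)
after_backtrack = monitor.holds(("at", "plane", "BWI"))
print(after_apply, after_backtrack)
```

Keeping the overlay separate from the base state is what lets the planner backtrack cheaply in this sketch; a system whose changes are not easily revertible would need to store old states instead.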

2.3 Finite Evaluability of ccc’s and Completeness of A-SHOP

An important question for any planning algorithm is whether all solution plans
produced by the algorithm are correct (i.e., soundness of the algorithm) and
whether the algorithm will find solutions for solvable problems (i.e.,
completeness of the algorithm). Soundness and completeness proofs of classical planners
assume that the preconditions can be evaluated relative to the current state. In
SHOP, for example, the state is accessed to test whether a method is applicable,
by examining whether the method’s preconditions are valid in the current state.
Normally it is easy to guarantee the ability to evaluate preconditions, because
the states typically are lists of predicates that are locally accessible to the
planner. However, if these lists of predicates are replaced by code call conditions,
this is no longer the case.

We mentioned in Subsection 1.3 the condition of safeness to ensure
evaluability of a code call. We also mentioned that an evaluable cc does not need to
terminate. Consider the code call condition

    in(X, math:geq(25)) &
    in(Y, math:square(X)) & Y < 2000,

which describes all numbers that are less than 2000 and that are squares of an
integer greater than or equal to 25.

Clearly, over the integers there are only finitely many ground substitutions
that cause this code call condition to be true. Furthermore, this code call
condition is safe. However, its evaluation may never terminate. The reason for this is
that safety requires that we first compute the set of all integers that are greater
than or equal to 25, leading to an infinite computation.
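
The effect can be reproduced in a few lines of Python. In this sketch, which rests on assumed stand-ins `geq` and `square` for math:geq and math:square, the first code call is modelled as an infinite generator. Materializing its full result set would never finish, whereas the loop below stops early only because it exploits extra knowledge (squares grow with X) that a generic left-to-right evaluator does not have.

```python
import itertools
from typing import Iterator, List, Set, Tuple


def geq(n: int) -> Iterator[int]:
    """Hypothetical stand-in for math:geq(n): yields n, n+1, n+2, ... forever."""
    return itertools.count(n)


def square(x: int) -> Set[int]:
    """Hypothetical stand-in for math:square(x): a one-element result set."""
    return {x * x}


def answers(limit: int = 2000) -> List[Tuple[int, int]]:
    """All (X, Y) with X >= 25, Y = X*X and Y < limit.

    Terminates only because we break as soon as the square reaches the limit;
    calling set(geq(25)) first, as a naive evaluator would, never returns.
    """
    result: List[Tuple[int, int]] = []
    for x in geq(25):
        (y,) = square(x)
        if y >= limit:
            break
        result.append((x, y))
    return result


pairs = answers()
print(len(pairs), pairs[0], pairs[-1])
```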

Thus in general, we must impose some restrictions on code call conditions to
ensure that they are finitely evaluable. This is precisely what the condition of
strong safeness ([6,3]) does for the code call conditions. Intuitively, by requiring
that the code call condition is safe, we are ensuring that it is executable and by
requiring that it is strongly safe, we are ensuring that it will only return finitely
many answers.

Note that the problem of deciding whether an arbitrary code call execution
terminates is undecidable (and so is the problem of deciding whether a code call
condition $\chi$ holds in $\mathcal{O}$). Therefore we need some input of the agent designer (or
of the person who is responsible for the legacy code the agent is built upon). The
information needed is stored in a finiteness table (see [6,3]). This information is
used in the purely syntactic notion of strong safeness. It is a compile-time check,
an extension of the well-known (syntactic) safety condition in databases.

Lemma 1 (Evaluating Agentized Operators). Let (AgentMeth $h$ $\chi$ $\mathbf{t}$) be an
agentized method, $\mathcal{O}$ a state, and (AgentOp $h'$ $\chi_{add}$ $\chi_{del}$) an agentized operator.
If the precondition $\chi$ is strongly safe wrt. the variables in $h$, the problem of
deciding whether $\chi$ holds in $\mathcal{O}$ can be algorithmically solved. If the add- and
delete-lists $\chi_{add}$ and $\chi_{del}$ are strongly safe wrt. the variables in $h'$, the problem
of applying the agentized operator to $\mathcal{O}$ can be algorithmically solved.

Theorem 1 (Soundness, Completeness). Let $\mathcal{O}$ be a state and $\mathcal{D}$ be a
collection of agentized methods and operators. If all the preconditions in the agentized
methods and add- and delete-lists in the agentized operators are strongly safe
wrt. the respective variables in the heads, then A-SHOP is correct and complete.

3  Probabilistic  Reasoning 

Up to now our framework of agent programs does not allow us to reason about
uncertain information. Consider a code call of the form $d:f(args)$. This code
call returns a set of objects. If an object $o$ is returned by such a code call, then
this means that $o$ is definitely in the result of evaluating $d:f(args)$.

However, there are many cases, particularly in applications involving
reasoning about knowledge, where a code call may need to return an “uncertain”
answer. We show in this section that our framework can be easily generalized to
deal with probabilistic reasoning.

Example 2 (Surveillance Example). Consider a surveillance application where
there are hundreds of (identical) surveillance agents, and a geographic agent.
The data types associated with the surveillance and geographic agent include
the standard int, bool, real, string, file data types, plus those shown below:

    Surveillance Agent          Geographic Agent
    ------------------          ----------------
    image: record of            map: ↑quadtree;
        imageid: file;          quadtree: record of
        day: date;                  place: string;
        time: int;                  xcoord: int;
        location: string            ycoord: int;
    imagedb: setof image;           pop: int
                                    nw, ne, sw, se: ↑quadtree

A  third  agent  may  well  merge  information  from  these  two  agents,  tracking  a 
sequence  of  surveillance  events. 

The surv agent may support a function surv:identify() which takes as input
an image and returns as output the set of all identified vehicles in it. It may also
support a function called surv:turret() that takes as input a vehicle id and
returns as output the type of gun turret it has. Likewise, the geo agent may
support a function geo:getplnode() which takes as input a map and the name
of a place and returns the set of all nodes with that name as the place field, a
function geo:getxynode() which takes as input a map and the coordinates of
a place and returns the set of all nodes with those coordinates, and a
function called geo:range() that takes as input a map, an x, y coordinate pair,
and a distance r and returns as output the set of all nodes in the map (quadtree)
that are within r units of location (x, y).
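
As a rough illustration of geo:range(), the following Python sketch walks a quadtree whose node fields mirror the record type of the geographic agent. The class name QuadNode, the helper names, the use of Euclidean distance, and the sample places are assumptions made for this sketch; the paper does not fix how the function is implemented, and a real implementation would prune subtrees instead of visiting every node.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterator, List, Optional


@dataclass
class QuadNode:
    """A quadtree node with the fields of the geographic agent's record type."""
    place: str
    xcoord: int
    ycoord: int
    pop: int
    nw: Optional["QuadNode"] = None
    ne: Optional["QuadNode"] = None
    sw: Optional["QuadNode"] = None
    se: Optional["QuadNode"] = None


def walk(node: Optional[QuadNode]) -> Iterator[QuadNode]:
    """Yield the node and all of its descendants (pre-order)."""
    if node is None:
        return
    yield node
    for child in (node.nw, node.ne, node.sw, node.se):
        yield from walk(child)


def geo_range(root: QuadNode, x: float, y: float, r: float) -> List[QuadNode]:
    """Return all nodes within r units of (x, y); distance is assumed Euclidean."""
    return [n for n in walk(root) if hypot(n.xcoord - x, n.ycoord - y) <= r]


# Hypothetical sample map with three places.
depot = QuadNode("depot", 110, 95, 40)
harbour = QuadNode("harbour", 400, 400, 900)
town = QuadNode("town", 100, 100, 5000, nw=depot, se=harbour)
near = [n.place for n in geo_range(town, 100, 100, 20)]
```

Here near contains the places whose coordinates lie within 20 units of (100, 100).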

In this example, surv:identify(image1) tries to identify all objects in a given
image; however, it is well known that image identification is an uncertain task.
Some objects may be identified with 100% certainty, while in other cases, it may
only be possible to say it is either a T-72 tank with 40-50% probability, or a
T-80 tank with 50-60% probability.

Image  processing  algorithms  for  vehicle  surveillance  applications  that  return 
probabilistic  identifications  are  readily  available  (e.g.,  see  [7]  and  [8]). 


3.1  Probabilistic  Code  Calls 

The first step to extend our framework is to introduce the notion of a probabilistic
code call. Its main ingredient is a random variable.

Definition 7 (Random Variable of Type $\tau$). A random variable of type $\tau$
is a finite set RV of objects of type $\tau$, together with a probability distribution p
that assigns real numbers in the unit interval [0, 1] to members of RV such that

$\sum_{o \in RV} p(o) \le 1$.

It is important to note that in classical probability theory [9], random variables
satisfy the stronger requirement that $\sum_{o \in RV} p(o) = 1$. However, in many
real-life situations, a probability distribution may have missing pieces, which
explains why we have chosen a weaker definition.

Definition 8 (Probabilistic Code Call $a :_{RV} f(d_1, \ldots, d_n)$). Suppose the
code call $a : f(d_1, \ldots, d_n)$ has output type $\tau$. The probabilistic code call
associated with $a : f(d_1, \ldots, d_n)$, denoted $a :_{RV} f(d_1, \ldots, d_n)$, returns a set
of random variables of type $\tau$ when executed.

Example 3. Consider the code call surv:identify(image1). This code call may
return the following two random variables.

({t72, t80}, {(t72, 0.5), (t80, 0.4)}) and ({t60, t84}, {(t60, 0.3), (t84, 0.7)})

This says that the image processing algorithm has identified two objects in
image1. The first object is either a T-72 or a T-80 tank with 50% and 40%
probability, respectively, while the second object is either a T-60 or a T-84 tank
with 30% and 70% probability, respectively.
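
The weaker requirement of Definition 7 can be sketched in Python as follows. The class and method names are assumptions made for this illustration and are not part of the framework; the two instances use the values of Example 3.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RandomVariable:
    """A finite set of objects with probabilities that sum to at most 1."""
    dist: Dict[str, float]

    def __post_init__(self) -> None:
        if any(not 0.0 <= p <= 1.0 for p in self.dist.values()):
            raise ValueError("each probability must lie in [0, 1]")
        # Small tolerance guards against floating-point rounding.
        if sum(self.dist.values()) > 1.0 + 1e-9:
            raise ValueError("probabilities must sum to at most 1")

    def missing_mass(self) -> float:
        """Probability mass not assigned to any object (the 'missing pieces')."""
        return max(0.0, 1.0 - sum(self.dist.values()))


# The two random variables of Example 3.
first = RandomVariable({"t72": 0.5, "t80": 0.4})
second = RandomVariable({"t60": 0.3, "t84": 0.7})
```

The first random variable leaves roughly 10% of the probability mass unassigned, which a classical random variable would not allow; the second one is also a random variable in the classical sense.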

Probabilistic code calls and code call conditions look exactly like ordinary code
calls and code call conditions; however, as a probabilistic code call returns a
set of random variables, probabilistic code call atoms are true or false with some
probability.

We  are  now  ready  to  generalize  the  notion  of  a  state  of  an  agent  to  its 
probabilistic  counterpart. 

Definition 9 (Probabilistic State of an Agent). The probabilistic state of
an agent a at any given point t in time, denoted $\mathcal{O}^{p}(t)$, consists of the set of all
instantiated data objects and random variables of types contained in $\mathcal{T}_a$.

3.2 Conjunction Strategy and Probabilistic Agent Programs

The  next  step  is  to  define  the  satisfaction  relation  of  probabilistic  code  call 
conditions.  This  is  problematic  as  the  following  example  illustrates. 

Example 4. Consider the probabilistic code call condition

in(X, surv :RV identify(image1)) & in(a1, surv :RV turret(X)).

This code call condition attempts to find all vehicles in "image1" with a gun
turret of type a1. Let us suppose that the first code call returns just one random
variable specifying that image1 contains one vehicle which is either a T-72
(probability 50%) or a T-80 tank (probability 40%). When this random variable (X) is
passed to the second code call, it returns one random variable with two values:
a1 with probability 30% and a2 with probability 65%. What is the probability
that the code call condition above is satisfied by a particular assignment to X?
The answer to this question depends very much upon the knowledge we have (if
any) about the dependencies between the identification of a tank as a T-72 or
a T-80, and the type of gun turret on these. For instance, if we know that all
T-72's have a2 type turrets, then the probability of the conjunct being true when
X is a T-72 tank is 0. On the other hand, it may be that the turret identification
and the vehicle identification are independent for T-80s; hence, when X is set
to T-80, the probability of the conjunct being true is 0.4 × 0.3 = 0.12.

Therefore  the  probability  that  a  conjunction  is  true  depends  not  only  on  the 
probabilities  of  the  individual  conjuncts,  but  also  on  the  dependencies  between 
the  events  denoted  by  these  conjuncts. 

We have solved this problem by introducing the notion of a probabilistic
conjunction strategy $\otimes$ to capture these different ways of computing probabilities
via an abstract definition. We also use annotations to represent probability
intervals. For instance, [0, 0.4], [0.7, 0.9], [0.1, V/2], [V/4, V/2] are all annotations. The
annotation [0.1, V/2] denotes an interval only when a value in [0, 1] is assigned to
the variable V.

Definition 10 (Annotated Code Call Condition $\chi : \langle [ai_1, ai_2], \otimes \rangle$). If $\chi$
is a probabilistic code call condition, $\otimes$ is a conjunction strategy, and $[ai_1, ai_2]$
is an annotation, then $\chi : \langle [ai_1, ai_2], \otimes \rangle$ is an annotated code call condition.
$\chi : \langle [ai_1, ai_2], \otimes \rangle$ is ground if there are no variables in either $\chi$ or in $[ai_1, ai_2]$.

Intuitively, the ground annotated code call condition $\chi : \langle [ai_1, ai_2], \otimes \rangle$ says that
the probability of $\chi$ being true (under conjunction strategy $\otimes$) lies in the interval
$[ai_1, ai_2]$. For example, when X is ground,

in(X, surv :RV identify(image1)) & in(a1, surv :RV turret(X)) : ⟨[0.3, 0.5], ⊗_ig⟩

is  true  if  and  only  if  the  probability  that  X  is  identified  by  the  surv  agent  and 
that  the  turret  is  identified  as  being  of  type  al  lies  between  30  and  50%  assuming 
that  nothing  is  known  about  the  dependencies  between  turret  identifications  and 
identifications  of  objects  by  surv. 
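
To make the role of a conjunction strategy concrete, the following Python sketch combines two probability intervals under two textbook assumptions: independence (endpoints are multiplied) and ignorance (the Fréchet bounds). These formulas are standard, but the paper only gives the abstract notion and defers the semantics to [1], so the function names and the reading of an annotation as "the computed interval lies inside the annotated interval" are assumptions of this sketch.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) of a probability


def conj_independence(p: Interval, q: Interval) -> Interval:
    """Conjunction of independent events: multiply the bounds."""
    return (p[0] * q[0], p[1] * q[1])


def conj_ignorance(p: Interval, q: Interval) -> Interval:
    """Conjunction when nothing is known about dependencies (Frechet bounds)."""
    return (max(0.0, p[0] + q[0] - 1.0), min(p[1], q[1]))


def satisfies(computed: Interval, annotation: Interval) -> bool:
    """Assumed reading: the computed interval must lie within the annotation."""
    return annotation[0] <= computed[0] and computed[1] <= annotation[1]


# Point probabilities from Example 4: X is a T-80 (0.4), turret is a1 (0.3).
t80: Interval = (0.4, 0.4)
a1: Interval = (0.3, 0.3)

independent = conj_independence(t80, a1)  # close to (0.12, 0.12)
ignorant = conj_ignorance(t80, a1)        # lower bound 0.0, upper bound 0.3
```

Under ignorance the conjunction for the T-80 case is only known to lie between 0 and 0.3, so under the assumed reading it would not satisfy an annotation such as [0.3, 0.5].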

We  are  now  ready  to  define  the  concept  of  a  probabilistic  agent  program. 

Definition 11 (Probabilistic Agent Programs $\mathcal{PP}$). Suppose $\Gamma$ is an
annotated code call condition, and $A, L_1, \ldots, L_n$ are status atoms. Then

$A \leftarrow \Gamma \,\&\, L_1 \,\&\, \ldots \,\&\, L_n$ (1)

is a probabilistic agent rule. For such a rule r, we use $B^{+}_{as}(r)$ to denote the
positive status atoms in $\{L_1, \ldots, L_n\}$, and $B^{-}_{as}(r)$ to denote the set of negative
status literals in $\{L_1, \ldots, L_n\}$.

A probabilistic agent program (pap for short) is a finite set of probabilistic
agent rules.

Consider  an  intelligent  sensor  agent  that  is  performing  surveillance  tasks. 
The  following  rules  specify  a  small  pap  that  such  an  agent  might  use. 

Do send_warn(X) ← (in(F, surv:file(imagedb)) &
                   in(X, surv :RV identify(F)) &
                   in(a1, surv :RV turret(X))) : ⟨[0.7, 1.0], ⊗_ig⟩ &
                  ¬F send_warn(X).

F send_warn(X) ← in(F, surv:file(imagedb)) &
                 in(X, surv :RV identify(F)) &
                 in(L, geo :RV getplnode(X.location)) &
                 in(L, geo :RV range(100, 100, 20)).

This  agent  operates  according  to  two  very  simple  rules.  The  first  rule  says  that 
it  sends  a  warning  whenever  it  identifies  an  enemy  vehicle  as  having  a  gun  turret 
of  type  al  with  over  70%  probability,  as  long  as  sending  such  a  warning  is  not 
forbidden.  The  second  rule  says  that  sending  a  warning  is  forbidden  if  the  enemy 
vehicle  is  within  20  units  of  distance  from  location  (100, 100). 

Defining the semantics of this kind of program is beyond the scope of this paper;
we refer to [1].

4  Serving  Requests  more  Efficiently 

With the increase in agent-based applications, there are now agent systems that
support concurrent client accesses. The ability to process large volumes of
simultaneous requests is critical in many such applications. In such a setting, the
traditional approach of serving these requests one at a time via queues (e.g. FIFO
queues, priority queues) is insufficient. In this section we review the approach
of [4]. The overall idea is that for a given set of requests one needs to

1. identify commonalities among them. This information can be used to
simplify the set and merge some of the requests together.

2.  compute  a  single  global  execution  plan  that  simultaneously  optimizes  the 
total  expected  cost  of  this  set  of  code  call  conditions. 

Instead of sending many individual requests one after another, sending one large
merged request (from whose answer the answers to the original requests can
be deduced) can already save a lot of network time.

4.1 Invariants

How can we detect commonalities? Obviously, we need input from the agent
developer. In our framework, an agent developer specifies several parameters. One
of these parameters must include some domain-specific information, explicitly
laying out what inclusion and equality relations are known to hold of code calls.
Such information is specified via invariants. An important ingredient of their
definition is the notion of an invariant expression.

Definition  12  (Invariant  Expression). 

- Every evaluable code call condition is an invariant expression. We call such
expressions atomic.

- If $ie_1$ and $ie_2$ are invariant expressions, then $(ie_1 \cup ie_2)$ and $(ie_1 \cap ie_2)$ are
invariant expressions. (We will often omit the parentheses.)

Example  5.  Two  examples  of  invariant  expressions  are: 

in(StudentRec, rel:select(courseRel, exam, "=", midterm1)) &
in(C, excel:chart(excelFile, StudentRec, grade))

and

in(X, spatial:horizontal(T, B, U)) ∪ (in(Y, spatial:horizontal(T', B', U')) ∪
in(Z, spatial:horizontal(T', B', U))).

What is the meaning, i.e. the denotation, of such expressions? The first
invariant expression represents the set of all objects c such that

in(StudentRec, rel:select(courseRel, exam, "=", midterm1)) &
in(c, excel:chart(excelFile, StudentRec, grade))

holds: we are looking for instantiations of C. Note that under this viewpoint, the
intermediate variable StudentRec, which is needed in order to instantiate C to
an object c, does not matter. There might just as well be situations where we are
interested in pairs (c, studentrec) instead of just c.

Definition 13 (Invariant Condition (ic)). An invariant condition atom is a
statement of the form $t_1 \;\mathrm{Op}\; t_2$ where $\mathrm{Op} \in \{\le, \ge, <, >, =\}$ and each of $t_1$, $t_2$ is
either a variable or a constant. An invariant condition (ic) is defined inductively
as follows:

1. Every invariant condition atom is an ic.

2. If $C_1$ and $C_2$ are ic's, then $C_1 \wedge C_2$ and $C_1 \vee C_2$ are ic's.


Definition 14 (Invariant inv, INV). An invariant, denoted by inv, is a
statement of the form

$ic \Longrightarrow ie_1 \;\Re\; ie_2$ (2)

where

1. ic is an invariant condition, all variables occurring in ic are among
$var_{base}(ie_1) \cup var_{base}(ie_2)$,

2. $\Re \in \{=, \subseteq\}$, and

3. $ie_1$, $ie_2$ are invariant expressions.

If $ie_1$ and $ie_2$ both contain solely atomic code call conditions, then we say that inv
is a simple invariant. If ic is a conjunction of invariant condition atoms, then
we say that inv is an ordinary invariant. The set of all invariants is denoted by
INV.

The invariant

Rel = Rel' ∧ Attr = Attr' ∧ Op = Op' = "<" ∧ Val < Val'
⟹ in(X, rel:select(Rel, Attr, Op, Val)) ⊆ in(Y, rel:select(Rel', Attr', Op', Val'))

says that the code call condition in(X, rel:select(Rel, Attr, Op, Val)) can be
evaluated by using the results of the ccc in(Y, rel:select(Rel', Attr', Op', Val')) if
the above conditions are satisfied. Note that this expresses semantic
information that is not available on the syntactic level: the operator "<" is related to
the relation symbol <.
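
The way such an invariant is used can be sketched in Python: if the invariant condition holds for two select calls, the narrower call is answered by filtering the result of the wider one. The Select record, the row format (a list of dictionaries), and the sample data are assumptions made for this sketch, and only the operator "<" is handled.

```python
from dataclasses import dataclass
from typing import Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Select:
    """Arguments of a hypothetical rel:select(Rel, Attr, Op, Val) code call."""
    rel: str
    attr: str
    op: str
    val: float


def implies_subset(narrow: Select, wide: Select) -> bool:
    """Invariant condition: same relation and attribute, both '<', Val < Val'."""
    return (narrow.rel == wide.rel and narrow.attr == wide.attr
            and narrow.op == wide.op == "<" and narrow.val < wide.val)


def evaluate(rows: List[Row], call: Select) -> List[Row]:
    """Evaluate a select call from scratch (only '<' is supported here)."""
    if call.op != "<":
        raise NotImplementedError("this sketch only handles the '<' operator")
    return [row for row in rows if row[call.attr] < call.val]


def answer_from(wide_result: List[Row], narrow: Select) -> List[Row]:
    """Answer the narrower call by a selection on the wider call's result."""
    return [row for row in wide_result if row[narrow.attr] < narrow.val]


# Hypothetical relation contents.
course_rel = [{"grade": 35.0}, {"grade": 62.0}, {"grade": 91.0}]
wide = Select("courseRel", "grade", "<", 80.0)
narrow = Select("courseRel", "grade", "<", 50.0)
```

When implies_subset(narrow, wide) holds, answer_from(evaluate(course_rel, wide), narrow) returns the same rows as evaluating narrow directly, without touching the relation a second time.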


4.2  Merging  Requests 

Let us suppose now that we have a set $\mathcal{I}$ of invariants, and a set S of data
structures that are manipulated by the agent. How exactly should a set C of
code call conditions be merged together? And what needs to be done to support
this? Our architecture contains two parts:

(i)  a  development  time  phase  stating  what  the  agent  developer  must  specify 
when  building  her  agent,  and  what  algorithms  are  used  to  operate  on  that 
specification,  and 

(ii)  a  deployment  time  phase  which  specifies  how  the  above  development-time 
specifications  are  used  when  the  agent  is  in  fact  running  autonomously. 


Development  Time  Phase.  When  the  agent  developer  builds  her  agent,  the 
following  things  need  to  be  done. 

1. First, the agent developer specifies a set $\mathcal{I}$ of invariants.

2. Suppose C is a set of ccc's to be evaluated by the agent. Each code call
condition $\chi \in C$ is represented via an evaluable cceg (see Figure 3 in
Subsection 1.3). Let INS(C) represent the set of all nodes in ccegs of $\chi$'s in C:

$INS(C) = \{ v_i \mid \exists \chi \in C \text{ s.t. } v_i \text{ is in } \chi\text{'s cceg} \}$.

This can be done by a topological sort of the cceg for each $\chi \in C$.

3. Additional invariants can be derived from the initial set $\mathcal{I}$ of invariants. This
requires the ability to check whether a set $\mathcal{I}$ of invariants implies an inclusion
relationship between two invariant expressions. Although we have defined a
formally precise notion of a set of invariants implying other invariants, we
will provide a generic test called Chk_Imp for implication checking between
invariants. There are various instances of Chk_Imp that are sound but not
complete, thereby allowing us to specify various parameters and heuristics.
Given an arbitrary (but fixed) Chk_Imp test, we will provide an algorithm
called Compute-Derived-Invariants that calculates the set of derivable
invariants from the initial set $\mathcal{I}$ of invariants and needs to be executed just
once.
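
The computation of INS(C) in step 2 can be sketched in Python with the standard-library graphlib module. Here a cceg is modelled simply as a mapping from each node to the set of nodes it depends on, and nodes are written as strings; both choices, as well as the function names, are assumptions of this sketch rather than the data structures of the actual system.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# A cceg is modelled as: node -> set of nodes that must be evaluated first.
Cceg = Dict[str, Set[str]]


def evaluation_order(cceg: Cceg) -> List[str]:
    """Topologically sort one cceg so that dependencies come first."""
    return list(TopologicalSorter(cceg).static_order())


def ins(ccegs: List[Cceg]) -> Set[str]:
    """Collect INS(C): all nodes occurring in the ccegs of the given ccc's."""
    nodes: Set[str] = set()
    for cceg in ccegs:
        nodes.update(evaluation_order(cceg))
    return nodes


# cceg of the first code call condition of Example 5: the chart atom
# needs StudentRec, which is bound by the select atom.
select_atom = 'in(StudentRec, rel:select(courseRel, exam, "=", midterm1))'
chart_atom = "in(C, excel:chart(excelFile, StudentRec, grade))"
example_cceg: Cceg = {select_atom: set(), chart_atom: {select_atom}}

order = evaluation_order(example_cceg)
all_nodes = ins([example_cceg])
```

In this example the select atom is ordered before the chart atom, and INS(C) consists of exactly these two nodes.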


Deployment  Time  Phase.  Once  the  agent  has  been  “developed”  and  deployed 
and  is  running,  it  will  need  to  continuously  determine  how  to  merge  a  set  C  of 
code  call  conditions.  This  will  be  done  as  follows: 

1. The system identifies three types of relationships between nodes in INS(C).

Identical ccc's: First, we'd like to identify nodes $\chi_1, \chi_2 \in INS(C)$ which
are "equivalent" to one another, i.e. $\chi_1 = \chi_2$ is a logical consequence
of the set of invariants $\mathcal{I}$. This requires a definition of equivalence of
two code call conditions w.r.t. a set of invariants. This strategy is useful
because we can replace the two nodes $\chi_1, \chi_2$ by a single node. This avoids
redundant computation of both $\chi_1$ and $\chi_2$.

Implied ccc's: Second, we'd like to identify nodes $\chi_1, \chi_2 \in INS(C)$ which
are not equivalent in the above sense, but such that either $\chi_1 \subseteq \chi_2$ or
$\chi_2 \subseteq \chi_1$ holds, but not both. Suppose $\chi_1 \subseteq \chi_2$. Then we can compute
$\chi_2$ first, and then compute $\chi_1$ from the answer returned by computing
$\chi_2$. This way of computing $\chi_1, \chi_2$ may be faster than computing them
separately.

Overlapping ccc's: Third, we'd like to identify nodes $\chi_1, \chi_2 \in INS(C)$ for
which the preceding two conditions do not hold, but $\chi_1 \wedge \chi_2$ is consistent
with INS(C). In this case, we might be able to compute the answer to
$\chi_1 \vee \chi_2$. From the answer to this, we may compute the answer to $\chi_1$
and the answer to $\chi_2$. This way of computing $\chi_1, \chi_2$ may be faster than
computing them separately.

We will provide an algorithm, namely Improved-CSI, which will use the set
of derived invariants returned by the Compute-Derived-Invariants
algorithm above, to detect commonalities (equivalent, implied and overlapping
code call conditions) among members of C.

Example 6. The two code call conditions in(X, spatial:vertical(T, L, R)) and
in(Y, spatial:vertical(T', L', R')) are equivalent to one another if their
arguments are unifiable. The results of evaluating the code call condition

in(Z, spatial:range(T, 40, 50, 25))

are a subset of the results of evaluating the code call condition

in(W, spatial:range(T', 40, 50, 50))

if T = T'. Note that spatial:range(T, X, Y, Z) returns all points in T that are
at most Z units away from the point (X, Y). In this case, we can compute the
results of the former code call condition by executing a selection on the results of
the latter rather than executing the former from scratch. Finally, consider
the following two code call conditions:

in(X, spatial:horizontal(map, 100, 200)),
in(Y, spatial:horizontal(map, 150, 250)).

Here spatial:horizontal(map, a, b) returns all points (X, Y) in map such that
a ≤ Y ≤ b. Obviously, the results of neither of these two code call conditions
are a subset of the results of the other. However, the results of these two code
call conditions overlap with one another. In this case, we can execute the
code call condition in(Z, spatial:horizontal(map, 100, 250)). Then, we can
compute the results of the two code call conditions by executing selections
on the results of this code call condition.

2. We will then provide two procedures to merge sets of code call conditions,
BFMerge and DFMerge, that take as input (i) the set C, (ii) the
output of the Improved-CSI algorithm above, and (iii) a cost model for agent
code call condition evaluations. Both these algorithms are parameterized by
heuristics and we propose three alternative heuristics. Then we evaluate our
six implementations (3 heuristics times 2 algorithms) and also compare them
with an A* based approach.
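
The three kinds of commonality can be illustrated for the spatial:horizontal calls of Example 6. The Python sketch below classifies two calls by comparing their Y-ranges and answers an overlapping pair from one merged call. It is specific to closed intervals and is only meant as an illustration: Improved-CSI works from derived invariants, not from interval arithmetic, and the function names and sample points are assumptions.

```python
from typing import Set, Tuple

Point = Tuple[int, int]
Band = Tuple[int, int]  # (a, b) of a spatial:horizontal(map, a, b) call


def horizontal(points: Set[Point], a: int, b: int) -> Set[Point]:
    """All points (X, Y) with a <= Y <= b."""
    return {(x, y) for (x, y) in points if a <= y <= b}


def relate(r1: Band, r2: Band) -> str:
    """Classify two calls on the same map by comparing their bands."""
    if r1 == r2:
        return "equivalent"
    if (r2[0] <= r1[0] and r1[1] <= r2[1]) or (r1[0] <= r2[0] and r2[1] <= r1[1]):
        return "implied"
    if max(r1[0], r2[0]) <= min(r1[1], r2[1]):
        return "overlapping"
    return "unrelated"


def merged_eval(points: Set[Point], r1: Band, r2: Band):
    """Run one covering call, then answer both calls by selections on it."""
    cover = horizontal(points, min(r1[0], r2[0]), max(r1[1], r2[1]))
    return horizontal(cover, *r1), horizontal(cover, *r2)


# Hypothetical map contents.
world = {(1, 90), (2, 120), (3, 170), (4, 240), (5, 300)}
first, second = merged_eval(world, (100, 200), (150, 250))
```

For the two calls of Example 6, relate((100, 200), (150, 250)) yields "overlapping", and both answers are obtained from the single call over the band from 100 to 250.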

We implemented both these algorithms on top of the IMPACT agent development
platform, and on top of a (non-IMPACT) geographic database agent.

4.3  Results 

Development Phase. The definition of a sound and complete instance of
Chk_Imp is based on the definition of a certain monotone fixpoint operator,
the least fixpoint of which constitutes the set of implied invariants ([4]).
Completeness is proved by reducing the problem to the completeness of a particular
first-order calculus and using a Henkin-like construction.

Proposition 1 (co-NP Completeness of Checking Implication).
Suppose all datatypes have a finite domain (i.e. each datatype has only finitely
many values of that datatype). Then the problem of checking whether an
arbitrary invariant expression $ie_1$ implies another invariant expression $ie_2$ is
co-NP-complete. The same holds for the problem of checking whether an invariant is a
tautology.

We  have  therefore  studied  the  tradeoffs  involved  in  using  sound,  but  perhaps 
incomplete  implementations  of  implication  checking. 

There are clearly many ways of implementing the algorithm Chk_Imp that
are sound, but not complete. We considered a generic algorithm to implement
Chk_Imp, where the complexity can be controlled by two input parameters: an
axiomatic inference system and a threshold.
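
A minimal Python sketch of such a sound but incomplete test is given below. It assumes a deliberately tiny inference system, consisting only of transitivity of inclusion, and interprets the threshold as the maximal number of derivation rounds; the actual inference systems and the exact meaning of the threshold in [4] may differ, and all names are hypothetical.

```python
from typing import Set, Tuple

Inclusion = Tuple[str, str]  # (a, b) stands for "a is included in b"


def chk_imp(known: Set[Inclusion], goal: Inclusion, threshold: int) -> bool:
    """Try to derive goal from known inclusions within `threshold` rounds.

    True means the goal was derived (sound); False only means it was not
    derived within the given effort, not that it fails to hold.
    """
    derived = set(known)
    for _ in range(threshold):
        if goal in derived:
            return True
        # One round of the transitivity rule over the current snapshot.
        new = {(a, d) for (a, b) in derived for (c, d) in derived if b == c}
        new -= derived
        if not new:
            break  # fixpoint reached, nothing more can be derived
        derived |= new
    return goal in derived


# Hypothetical chain of inclusions between four invariant expressions.
base = {("ie1", "ie2"), ("ie2", "ie3"), ("ie3", "ie4")}
```

With these inclusions, chk_imp(base, ("ie1", "ie4"), 1) gives up after one round and returns False, while a threshold of 2 suffices to derive the goal. This illustrates the trade-off between effort and completeness that the threshold controls.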

Deployment Phase. We developed two algorithms (and various accompanying
heuristics) which allow an agent to automatically rewrite requests so as to avoid
redundant work; these algorithms take the invariants associated with the agent into
account. Our algorithms are independent of any specific agent framework. We
implemented both these algorithms on top of the IMPACT agent development
platform, and on top of a (non-IMPACT) geographic database agent. Based on
these implementations, we conducted experiments which show that our algorithms
are considerably more efficient than methods based on the well-known algorithm
of [10], which uses the A* algorithm to merge multiple queries over relational
databases only. Our experiments show that although the A* algorithm finds
better global results, the cost of obtaining those results is so prohibitively
high that A* is often infeasible to use in practice.

Figure 6 shows that the execution time for determining overlapping code
calls is still below one second for a set of 20 ccc's. Similar times are obtained
for equivalent and implied ccc's. We also noted that there are often more ccc's
falling in the implied or overlapping categories than in the equivalence category.
As the methods based on the A* algorithm search only for the latter category,
our optimizations pay off.

We have also shown that our merging algorithms (1) can handle more
than twice as many simultaneous code call conditions as the A* algorithm, (2)
run 100 to 6300 times faster than the A* algorithm, and (3) produce execution
plans whose cost is at most 10% higher than that of the plans generated by the
A* algorithm.


Fig. 6. Execution Time of Merge Algorithms with overlapping ccc Sets

5 Conclusion

We have illustrated three powerful extensions to the basic IMPACT
framework: incorporating planning, incorporating uncertain reasoning and optimizing
queries sent over a network. While the first extension required an agentization
procedure to incorporate an efficient HTN planner into IMPACT, the second
extension extended the notion of a code call to one dealing with random variables
and required heavy use of annotated logic programming. The third extension
required fixpoint techniques and automated reasoning mechanisms (to prove the
completeness result).

The semantics of the basic framework as well as of the extensions described
are based on the notion of an agent program and thus are very much related
to nonmonotonic formalisms like the stable and well-founded semantics. We can
conclude that any formal approach to heterogeneous agent systems can benefit a
lot from Computational Logic, to which all the above techniques belong.

Abstract. Lixto is a system and method for the visual and interactive
generation of wrappers for Web pages under the supervision of a human
developer, for automatically extracting information from Web pages
using such wrappers, and for translating the extracted content into XML.
This paper describes some advanced features of Lixto, such as disjunctive
pattern definitions, specialization rules, and Lixto's capability of
collecting and aggregating information from several linked Web pages.


1  Introduction  and  Motivation 

Extracting relevant information automatically from HTML Web pages of
changing content, and converting the extracted information to a structured
representation is an important problem, to which a lot of research has been
dedicated [3,7,8,10,11,13,14]. XML was designed to enrich the semantics of Web
information [1,6]. Even if in some respects XML may not yet fulfill this goal
perfectly, XML appears to be the right representation format for the information
extracted from HTML. Programs that perform such extraction and translation
tasks are referred to as wrappers. Wrappers can be hand-coded, e.g. in
specialized languages such as Jedi [9] or Florid [12], or they can be produced via
wrapper generators. Wrapper generators are software tools that generate
wrappers via induction (e.g. [2,10,13]) or that semi-automatically support
the generation of wrappers via an interactive process supervised by a human
designer ([11,14]). Wrapper generators support the task of reverse engineering,
as the goal of a wrapper is to reverse the processing of dynamic Web sites that
generate HTML starting from an internal structured representation (such as a
relational database).

In a recent paper [5] we introduced Lixto, a new method and system for
visually generating HTML/XML wrappers under the supervision of a human
designer. Lixto allows a wrapper designer to interactively and visually define
information extraction patterns on the basis of visualized sample Web pages.
These extraction patterns are collected into a hierarchical knowledge base that
constitutes a declarative wrapper program. The extraction knowledge is
internally represented in a datalog-like special-purpose logic programming language,
called Elog. However, a user of Lixto is not concerned with the syntax of Elog
and does not need to learn this language, as she constructs an Elog wrapper
program by purely visual and interactive primitives without ever seeing the
resulting Elog program. Wrapper programs in Elog can be directly executed over
input Web sites by an extractor module that interprets the Elog rules, taking
care of the evaluation of special built-in predicates. Lixto also allows a designer
to define XML translation rules that specify how extracted content should be
translated into XML, a so-called XML translation scheme. Together with the
extraction pattern definitions (the Elog program), an XML translation scheme
additionally enables the system to construct a Document Type Definition (DTD)
which describes the characteristics of the output XML documents.

* All new methods and algorithms of the Lixto system are covered by a pending patent.
Future developments of Lixto will be reported at www.lixto.com.

The advantages of the Lixto wrapper generator over competing approaches
are mainly the following. (1) Very high expressive power, i.e., an unprecedented
capability of defining sophisticated extraction patterns. (2) Excellent visual
support: The wrapper designer's sole view of an example HTML document is the
browser-displayed standard image of the document (no annotations, overlays,
HTML sources or DOM trees) and the wrapper designer uses this display
directly for marking extraction patterns. (3) Good learnability, because no extraction
language needs to be learned and neither HTML nor XML knowledge is
necessary. (4) Sample parsimony, which means that very few sample pages (in most
cases a single one) are needed in order to define robust wrappers for large classes
of Web pages. (5) A simple and smooth XML translation mechanism that gives
a designer several options for formatting or modifying the XML output.

Basic features of Lixto are described in [4,5], where a comparison to
related research is also given. The main goal of the present paper is to introduce and
illustrate some of the more advanced features of the Elog language. All the
presented advanced features can be visually created by using Lixto without knowing
Elog. Details of the visual interface and the way of creating patterns can be found
in [4] and [5], where a precise description of the pattern generation algorithm is
given. There, these details are discussed for a restricted environment w.r.t. some
advanced concepts discussed in this paper, but a quite similar approach can be
used for these advanced features. The present paper is self-contained at the level
of general description, but not at the level of details. For the latter, we refer
to [5].

Among  the  advanced  features  we  discuss  here  are  disjunctive  wrapping,  i.e., 
defining  one  pattern  through  several  alternative  definitions;  pattern  specializa¬ 
tion,  i.e.,  defining  a  new  pattern  by  restricting  another  pattern;  interactively 
defining  new  document  patterns,  which  are  patterns  corresponding  to  entire  doc¬ 
uments  that  aie  identified  via  extracted  URLs;  Web  crawling,  which,  in  this  con¬ 
text,  means  that  a  pattern  hierarchy  is  built  that  aggregates  information  from 
various  Web  pages  by  starting  at  a  given  input  page  and  automatically  following 
URLs  to  other  pages;  and  recursive  wrapping  which  means  that  recursive  pat¬ 
tern  structures  (akin  to  recursive  data  types)  can  be  constructed  that  allow  the 
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system  to  crawl  to  an  indefinite  number  of  Web  pages  and  extract  information 
from  all  these  pages.  We  will  also  discuss  some  interesting  nonmonotonic  issues 
such  as  pattern  minimization  principles  and  the  semantics  of  range  restrictions. 
Moreover,  this  paper  introduces  pattern  graphs  for  describing  the  structure  of  the 
pattern hierarchy interactively defined by a designer (see Figures 3, 4, 6, and 7).
Note  that  pattern  graphs  for  simple  extraction  tasks  are  trees,  which  means  that 
there  is  a  strict  pattern  hierarchy.  When  disjunctive  pattern  definitions  are  used, 
then  the  corresponding  pattern  graphs  are  dags,  while  with  recursive  wrapping 
they  are  cyclic  graphs. 
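As a rough illustration of the three graph shapes just mentioned, the following Python sketch classifies a pattern graph given as a list of arcs. The function name classify_pattern_graph, the (parent, child) arc orientation, and the edge-list encoding are assumptions made for this illustration only; they are not part of Lixto.

```python
from collections import defaultdict


def classify_pattern_graph(arcs):
    """Classify a directed pattern graph as 'tree', 'dag', or 'cyclic'.

    arcs: iterable of (parent, child) pattern-name pairs. The orientation
    (parent first) is an assumption of this sketch.
    """
    children = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in arcs:
        if child not in children[parent]:
            children[parent].add(child)
            indegree[child] += 1
        nodes.update((parent, child))

    # Kahn's algorithm: a node that can never be removed lies on a cycle
    # or is reachable only through one.
    remaining = {n: indegree[n] for n in nodes}
    queue = [n for n in nodes if remaining[n] == 0]
    removed = 0
    while queue:
        n = queue.pop()
        removed += 1
        for c in children[n]:
            remaining[c] -= 1
            if remaining[c] == 0:
                queue.append(c)
    if removed < len(nodes):
        return "cyclic"

    # Acyclic: a strict hierarchy has one root and at most one parent per node.
    roots = [n for n in nodes if indegree[n] == 0]
    if len(roots) == 1 and all(indegree[n] <= 1 for n in nodes):
        return "tree"
    return "dag"
```

With this encoding, a pattern that has filters pointing to two different parents gets two incoming arcs, which is what turns a tree into a dag.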

The  paper  is  structured  as  follows.  In  the  next  two  sections  we  give  an 
overview  of  Lixto  and  a  description  of  the  basic  features  of  the  Elog  language. 
Section 4 takes a closer look at some features. In Section 5 we illustrate the
power of disjunctive pattern descriptions, whereas in Section 6 some light is
shed on Elog's aspects concerning link crawling and recursion. These sections
introduce  advanced  features  of  the  internal  language  of  Lixto  both  with  an  ab¬ 
stract  description  and  examples  from  the  commercial  domain.  Section  7  discusses 
various  nonmonotonic  aspects  of  Lixto  such  as  minimization,  range  conditions, 
and  further  recursive  aspects  introduced  by  pattern  references. 

2  Pattern  Generation  with  Lixto 

Architecture. The Lixto prototype consists of two main blocks: the Wrapper
Generator and the Program Evaluator. One module of the wrapper generator,
the Interactive Pattern Builder, allows a wrapper designer to create and to store
a wrapper in the form of an extraction program (a program in the language Elog).
Moreover, the wrapper generator contains the XML Translation Builder, which allows
a designer to specify how extracted data should be translated into XML format
and to store such a specification in the form of an XML translation scheme. The
program  evaluator  automatically  executes  an  extraction  program  (performed  by 
the  Extractor  module)  and  a  corresponding  XML  translation  scheme  (performed 
by  the  XML  translator  module)  over  Web  pages  by  extracting  data  from  them 
and  translating  the  extracted  data  into  XML  format.  (For  details  see  [5].) 

Extraction  Patterns.  A  wrapper  is  constructed  by  formalizing,  collecting, 
and  storing  the  knowledge  about  desired  extraction  patterns.  Extraction  pat¬ 
terns  describe  single  data  items  or  chunks  of  coherent  data  to  be  extracted  from 
Web  pages  by  their  locations  and  by  their  characteristic  internal  or  contextual 
properties. Extraction patterns are generated and refined interactively and semi-automatically
with the help of a human wrapper designer. They are constructed in
a  hierarchical  fashion  on  sample  pages  by  marking  relevant  items  or  regions  via 
mouse  clicks  or  similar  actions,  by  menu  selections,  and/or  by  simple  textual 
inputs  to  the  user  interface.  A  wrapper,  in  our  approach,  is  thus  a  knowledge 
base  consisting  of  a  set  of  extraction  patterns. 

While  patterns  are  descriptions  of  data  to  be  extracted,  pattern  instances 
are  concrete  data  elements  on  Web  pages  that  match  such  descriptions,  and 
hence are extracted. Lixto distinguishes different types of patterns: tree, string,
and document patterns. Tree patterns serve to extract parts of documents corresponding
to tree regions, i.e., to subtrees of their parse tree. String patterns
serve to extract textual strings from visible and invisible parts of a document (an
invisible part could be, e.g., an attribute value such as the name of an image).
Document  patterns  are  used  for  navigating  to  further  Web  pages. 

Logical Organization of Patterns. The logical organization of an extraction
pattern  is  as  follows:  each  extraction  pattern  has  a  name  and  contains  one  or 
more  so-called  filters.  Each  filter  provides  an  alternative  definition  of  data  to  be 
extracted  and  to  be  associated  with  the  pattern.  The  set  of  filters  of  a  pattern  is 
interpreted  disjunctively  (i.e.,  connected  by  logical  ORs).  Each  filter  is  associated 
to  a  parent  pattern  from  which  it  extracts  the  desired  information.  Tree  (string) 
patterns  are  specified  via  tree  (string)  filters. 

A  tree  filter  contains  a  representation  of  a  generalized  parse  tree  path  that 
matches  a  set  of  items  on  a  Web  page,  and  contains  a  set  of  conditions  that  these 
items  must  satisfy.  All  the  conditions  of  a  filter  are  interpreted  conjunctively,  i.e., 
an  element  of  a  Web  page  satisfies  a  filter  if  and  only  if  it  matches  its  generalized 
tree  path  and  satisfies  all  the  conditions  of  the  filter.  Similarly,  a  string  filter 
specifies  the  characteristics  of  the  text  to  be  extracted  (using  a  formal  language), 
and  possibly  further  conditions. 
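The conjunctive reading of a filter and the disjunctive reading of a pattern can be sketched in a few lines of Python. This is only a toy model under stated assumptions: elements are plain dictionaries, and the names satisfies_filter and pattern_instances are hypothetical helpers, not Lixto functions.

```python
def satisfies_filter(element, matches_path, conditions):
    """A filter accepts an element iff it matches the (generalized) tree path
    AND satisfies every condition of the filter."""
    return matches_path(element) and all(cond(element) for cond in conditions)


def pattern_instances(elements, filters):
    """A pattern accepts an element iff at least one of its filters does.

    filters: list of (matches_path, conditions) pairs.
    """
    return [
        e for e in elements
        if any(satisfies_filter(e, path, conds) for path, conds in filters)
    ]
```

Adding a condition to a filter can only shrink its result, whereas adding a filter to a pattern can only enlarge the pattern's result.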

Lixto offers a wrapper designer the possibility to express various types of
conditions  restricting  the  intended  pattern  instances  of  a  filter.  The  main  types 
of  conditions  are  inherent  (internal)  conditions,  contextual  (external)  conditions, 
and  range  conditions.  In  addition  to  these  three  basic  types  of  conditions,  Lixto 
allows  a  designer  to  express  auxiliary  conditions  like  pattern  reference  conditions, 
concept  conditions  and  comparison  conditions.  They  are  discussed  as  atoms  of 
the  Elog  language  in  more  detail  in  Section  3. 

Visual  Pattern  Generation.  Extraction  patterns  are  defined  by  the  designer  in 
a  hierarchical  manner.  A  pattern  that  describes  an  entire  document  is  referred  to 
as  a  document  pattern.  In  particular,  the  document  pattern  corresponding  to  the 
starting Web page, the so-called “home document pattern”, is available as a pre-existing
pattern. Other patterns are defined interactively. Filters or patterns are
usually  defined  in  the  context  of  other  patterns  (so-called  parent  patterns).  For 
example, a pattern <name> may be defined first, and then patterns <firstname>
and <familyname>, etc., may be defined in the context of the source pattern
<name>.  For  the  majority  of  common  extraction  tasks,  defining  flat  patterns 
or  a  strict  hierarchy  of  patterns  will  in  practice  be  sufficient.  However,  Lixto 
does  not  limit  the  pattern  definition  to  be  strictly  hierarchical  (i.e.  tree-like). 
Moreover,  pattern  definitions  are  allowed  to  be  recursive  (similar  to  recursive 
type  definitions  in  programming  languages).  While  patterns  are  not  required 
to  form  a  strict  hierarchy,  pattern  instances  do  always  form  one  and  can  be 
arranged  as  a  tree  (or  forest,  in  case  they  stem  from  different  documents,  which 
can  be  the  case  in  recursive  programs  as  explained  in  Section  6). 

The  visual  and  interactive  pattern  definition  method  allows  a  wrapper  de¬ 
signer  to  define  an  extraction  program  and  an  associated  XML  translation 
scheme  without  any  programming  efforts.  The  Lixto  Interactive  Pattern  Builder 
allows a wrapper designer to define filters and patterns with the help of one or
more  characteristic  example  pages,  and  to  modify  and  store  patterns.  At  various 
intermediate  steps,  the  designer  may  test  a  partially  or  fully  constructed  filter 
or  pattern,  both  on  the  example  pages  used  to  construct  the  pattern  as  well  as 
on  any  other  Web  page.  The  result  of  such  a  test  is  a  set  of  pattern  instances, 
which  is  displayed  by  a  browser  as  a  set  of  highlighted  items. 

The  filter  description  procedure  for  tree-filters  can  be  described  as  follows: 
The  designer  marks  an  initial  element  on  an  example  Web  page  (for  example, 
a  table).  The  system  associates  with  this  element  a  generalized  tree  path  of  the 
parse  tree  that  (possibly)  corresponds  to  several  similar  items  (for  example,  sev¬ 
eral  tables).  The  designer  then  tests  the  filter  for  the  first  time.  If  more  than  just 
the  intended  data  items  are  extracted  (and  thus  highlighted)  as  a  result  of  the 
test, then the designer adds restrictive conditions to the filter and tests the filter
again.  This  process  is  repeated  as  long  as  undesired  data  items  are  extracted.  At 
the  end  of  the  process,  the  filter  extracts  only  desired  items.  A  similar  procedure 
is  used  for  designing  string  filters.  However,  for  creating  a  string  rule  usually  no 
example  is  selected,  but  some  characterizations  are  visually  composed,  e.g.  by 
relying  on  concept  conditions.  A  pattern  is  designed  by  initially  asserting  one 
filter for the pattern, and, in case this is not sufficient (because testing shows
that  not  all  intended  extraction  items  on  the  test  pages  are  covered),  by  asserting 
successively  more  filters  for  the  pattern  under  construction,  until  each  intended 
extraction  item  is  covered  by  at  least  one  filter  associated  to  that  pattern. 

Observe  that  the  methods  of  filter  construction  and  pattern  construction 
correspond  to  methods  of  definition-narrowing  and  definition-broadening  that 
match  the  conjunctive  and  disjunctive  nature  of  filters  and  patterns,  respectively. 
It is the responsibility of the wrapper designer to perform sufficient testing and,
if required by the particular application, to test filters and patterns also on Web
pages different from the initially chosen example pages. Moreover, it is up to the
wrapper  designer  to  choose  suitable  conditions  that  will  work  not  only  on  the 
test  pages,  but  also  on  all  other  target  Web  pages. 

The  visual  and  interactive  support  for  pattern  building  offered  by  Lixto  also 
includes  specific  support  for  the  hierarchical  organization  of  patterns  and  filters. 
A  wrapper  definition  process  according  to  Lixto  (and  consequently,  a  Lixto  wrap¬ 
per)  is  not  limited  to  a  single  sample  Web  document,  and  not  even  to  sample  Web 
pages  of  the  same  type  or  structure.  During  wrapper  definition,  a  designer  may 
move  to  other  sample  Web  pages  (i.e.,  load  them  into  the  browser),  continuing 
the  wrapper  definition  there. 

XML Translation. The XML Translation Builder, which constitutes another
interactive module of the wrapper generator, is responsible for supporting a
wrapper  designer  during  the  generation  of  the  XML  translation  scheme.  By  de¬ 
fault,  pattern  names  are  used  as  output  XML  tags  and  the  hierarchy  of  extracted 
pattern  instances  determines  the  structure  of  the  output  XML  document.  Thus, 
in  case  no  specific  action  is  taken  by  the  designer,  the  pattern  instances  are 
translated  into  XML  in  a  standard  way  without  any  need  of  further  interac¬ 
tion.  However,  Lixto  also  offers  the  wrapper  designer  the  option  to  modify  the 
standard XML translation in various ways: renaming patterns, suppressing
auxiliary  patterns,  writing  some  HTML  attributes,  and  deciding  whether  in¬ 
stances  of  document  patterns  are  all  treated  at  the  same  level,  or  hierarchically 
ordered  as  defined  by  the  extraction  process.  Moreover,  to  define  a  DTD  based 
on an output, a wrapper designer can assign a multiplicity to each pattern, i.e.,
whether one or several instances are required or allowed to occur within a parent pattern.

These  desired  modalities  of  the  XML  translation  are  determined  during  the 
wrapper  design  process  by  a  very  simple  and  user-friendly  graphical  interface  and 
are  stored  in  the  form  of  an  XML  translation  scheme  that  encodes  the  mapping 
between  extraction  patterns  and  the  XML  output  in  a  suitable  form. 

3  An  Overview  of  the  Elog  Extraction  Language 

As  mentioned  in  the  previous  sections,  patterns  are  internally  represented  us¬ 
ing  the  declarative  extraction  language  Elog.  The  Elog  language  is  specifically 
designed  for  hierarchical  and  modular  data  extraction  and  it  is  ideally  suited 
for  representing  and  successively  incrementing  the  knowledge  about  extraction 
patterns.  It  uses  a  datalog-like  syntax  and  semantics,  enriched  with  several  pre¬ 
defined  predicates  related  to  information  extraction.  An  Elog  program  is  a  col¬ 
lection  of  rules  containing  special  extraction  atoms  in  their  bodies. 

We  illustrate  the  main  characteristics  of  Elog  using  an  example  program 
which  can  be  applied  to  eBay  pages,  e.g.  to  the  sample  page  in  Figure  1.  Figure  2 
shows  an  Elog  program  applied  to  a  category  search  result  page  of  eBay.  In 
the  following  examples,  we  additionally  use  a  pattern  graph  to  represent  a  Lixto 
wrapper.  A  pattern  graph  is  a  directed  graph  whose  nodes  represent  patterns  and 
an arc from a pattern p2 to a pattern p1 specifies that there is a filter defining p2
that extracts information from instances of p1. Moreover, document, tree, and
string  patterns  are  represented  using  different  shapes.  Finally,  it  is  possible  to 
represent  also  information  about  the  XML  translation  scheme  using  this  graph. 
In  particular,  we  specify  that  a  pattern  is  translated  to  an  XML  element  by 
writing  a  text  “pattern  name/elementname”  into  the  pattern  node.  If  the  element 
name is missing, then the pattern name is used as default translation. The set
of included attributes is embedded in a list, e.g. “[url, font]”, and patterns
that are not translated are drawn with dashed lines. It is possible to specify a
minimum and maximum multiplicity on the arcs (“[min, max]”) to specify the
information used in the construction of the DTD (see the end of this section).
When no multiplicity of a pattern is explicitly indicated in the pattern graph,
then a minimum and maximum multiplicity of 1 for that pattern are assumed.
The  pattern  graph  of  the  program  in  Figure  2  is  shown  in  Figure  3.  In  this  case, 
as  all  filters  of  one  pattern  point  to  the  same  parent,  it  forms  a  tree. 

An extraction program consists of a set of patterns. In Elog, a pattern p is
represented by a set of rules all having the same head atom of the form p(S, X).
Elog rules define elements to be extracted from Web pages. Each rule corresponds
to one filter. The head of an Elog rule r is always of the form p(S, X) where p
is a pattern name, S is a variable which is bound in the body of the rule to the
parent-pattern instances of the filter corresponding to r, and X is the target
variable  which,  at  extraction  time,  is  bound  to  some  target  pattern  instance  to 
be  extracted  (either  a  tree  region  or  a  textual  string).  The  body  of  an  Elog  rule 
contains  atoms  that  jointly  restrict  the  intended  pattern  instances.  For  example, 
an  Elog  rule  corresponding  to  a  tree  filter  contains  in  its  body  an  atom  expressing 
that  the  desired  pattern  instances  should  match  a  certain  tree  path  and  another 
atom that binds the variable S to a parent-pattern instance.

In  the  example  program,  the  pattern  <tableseq>  is  used  to  extract  a  se¬ 
quence  of  tables  which  represent  records.  Observe  that  in  each  search  result 
page of eBay, a record is a whole table consisting of a single table row. This
sequence of tables is required to be preceded by a table which contains the word
“Current”, and to be followed by an image representing a horizontal line.
ebaydocument(S, X) ← getDocument(S = $1, X)
tableseq(S, X) ← ebaydocument(_, S),
    subsq(S, (★.body.★.center, []), (.table, []), (.table, []), X),
    before(S, X, (★.tr, [(elementtext, Current, substr)]), 0, 0, _, _),
    after(S, X, (★.img, [(src, spacer.gif, substr)]), 0, 0, _, _)
record(S, X) ← tableseq(_, S), subelem(S, .table, X)
itemdes(S, X) ← record(_, S), subelem(S, (★.td.★.content, [(href, , substr)]), X)
price(S, X) ← record(_, S),
    subelem(S, (★.td, [(elementtext, \var[Y].*, regvar)]), X),
    isCurrency(Y)
bids(S, X) ← record(_, S), subelem(S, ★.td, X), before(S, X, .td, 0, 30, Y, _),
    price(_, Y)
date(S, X) ← record(_, S), subelem(S, ★.td, X), notafter(S, X, .td, 100)
currency(S, X) ← price(_, S), subtext(S, \var[Y], X), isCurrency(Y)
pricewc(S, X) ← price(_, S), subtext(S, [0-9]+\.[0-9]+, X)


Fig. 2. Elog Extraction Program for a single eBay page


The rule with head predicate record(S, X) in Figure 2 identifies all tables
within a specific area, which is the instance of tableseq(_, S). For each ground
atom tableseq(p, s) (where p and s are tree regions), this rule derives atoms
of the form record(s, x) for each table x contained in s. Thus the variable S
identifies the context of the extraction; in this case, these are the instantiations
of tableseq. Optionally, the body of an Elog rule may contain further atoms
expressing  conditions  that  the  pattern  instances  should  additionally  satisfy.  In 
particular,  for  each  type  of  condition,  there  exists  a  built-in  predicate  (see  below). 

The description of each item (occurring in the second column of each record)
is determined by the extraction rule whose head is itemdes(S, X). The first
atom  in  the  rule  body  specifies  that  the  context  S  of  the  extraction  is  a  table 
and  ensures  that  the  variable  S  is  instantiated  with  a  table.  The  second  atom  in 
the  rule  body  looks  for  subelements  of  the  table  that  qualify  as  table  columns 
with  some  specific  properties,  in  particular  requiring  that  they  contain  a  link 
(href).  The  rule  has  as  many  matches  as  there  are  items  on  the  given  page. 
If  the  Web  page  is  updated  and  two  new  records  are  inserted  into  the  table, 
then  the  same  rule  will  produce  two  more  matches.  Each  match  gives  rise  to  a 
corresponding  instantiation  of  the  variable  X. 

Thus,  the  head  predicates  defined  by  an  Elog  program  represent  the  extrac¬ 
tion  patterns  defined  by  the  wrapper  program.  For  instance,  the  program  in 
Figure 2 defines patterns such as <record> and <itemdes>. Elog rule bodies contain
the following important ingredients. For a more detailed discussion about
Elog  predicates  see  Section  4.4  of  [5]. 

[Figure 3: pattern graph of the program in Figure 2. Legible node labels: ebaydocument (document pattern); tableseq and record/ebayitem (tree patterns); currency and pricewc (string patterns).]

Fig. 3. Pattern Structure of Example of Figure 2

Incompletely specified tree paths. These refer to the position(s) of the desired
element(s) in the HTML tree. More details on the used document model
are specified in [5]. There are various ways to specify a tree path pointing to,
e.g., a table row in an eBay page. The fully specified tree path to this node
is: body.table.tr (the elements satisfying these paths are referred to as matched
pattern instances). Two incompletely specified tree paths to the same node are
.★.body.★.tr and .★.body.★.table.★.tr, where the star signs are wildcards
(the dots just act as concatenation signs). An incompletely specified tree path
.★.name is an abbreviation of the skip-to sequence (Σ − name)*name, where Σ is
the alphabet of element types. The first discovered elements of the type “name”
are considered in all possible paths. Observe that, interpreting the star in this
way, a tree path .★.table identifies only the outermost tables in a document, and
hence acts as some kind of minimization.
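The skip-to reading of the star can be made concrete with a small Python sketch. It assumes a toy document tree built from a hypothetical Node class, and it encodes a tree path as a list of (wildcard, name) steps; neither the class nor the encoding comes from Lixto.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    label: str = ""                     # only used to tell nodes apart in examples
    children: list = field(default_factory=list)


def skip_to(node, name):
    """Yield, on every downward path from node, the first element called name.

    Elements of other types are skipped; the search does not continue
    inside a match, which is what keeps nested matches out.
    """
    for child in node.children:
        if child.name == name:
            yield child
        else:
            yield from skip_to(child, name)


def match_path(root, steps):
    """Evaluate a tree path relative to root.

    steps: list of (wildcard, name); wildcard=True stands for a star step,
    wildcard=False for a plain child step.
    """
    current = [root]
    for wildcard, name in steps:
        matched = []
        for node in current:
            if wildcard:
                matched.extend(skip_to(node, name))
            else:
                matched.extend(c for c in node.children if c.name == name)
        current = matched
    return current
```

For a page whose body holds two tables, one of which nests a further table in a cell, the single star step for "table" returns only the two outer tables; the nested one is reached only by a second star step.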

Attribute  Conditions.  An  incompletely  specified  tree  path  may  be  too  general 
for  describing  an  intended  extraction  target.  In  that  case,  additional  atoms  in 
the  rule  body  may  express  further  restricting  conditions.  Among  these  are  so- 
called  attribute  conditions.  Attribute  conditions  impose  restrictions  on  matched 
elements.  For  example,  leaf  nodes  of  the  HTML  tree  representing  text  strings  may 
have  a  font-style  attribute  which  takes  the  value  italics  if  the  represented  text 
is in italics. Moreover, we treat the contents of an element as a special attribute
elementtext. Consider the rule for tableseq in Figure 2: One of its predicates
uses an attribute condition expressing that the elementtext needs to contain the
word “Current” (“contain” due to the substr keyword). This attribute condition
restricts the tree path .★.table, which identifies tables, by limiting the matches
to those text fields that contain the word “Current”. Attribute conditions may
require  exact  matches  or  partial  matches,  or  satisfaction  of  a  particular  regular 
expression  possibly  extended  by  the  use  of  variables. 
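A minimal Python sketch of the three matching modes follows. Only the keyword substr is taken from Figure 2; the mode names "exact" and "regexp", the dictionary representation of an element, and the helper attribute_condition are assumptions of this sketch, and regular expressions with variables (regvar) are not modelled.

```python
import re


def attribute_condition(attr, value, mode):
    """Build a predicate over elements given as {attribute: string} dicts.

    mode "exact":  the attribute value must equal value
    mode "substr": the attribute value must contain value
    mode "regexp": the attribute value must contain a match for value
    """
    def check(element):
        actual = element.get(attr)
        if actual is None:          # attribute not defined on this element
            return False
        if mode == "exact":
            return actual == value
        if mode == "substr":
            return value in actual
        if mode == "regexp":
            return re.search(value, actual) is not None
        raise ValueError(f"unknown mode: {mode}")
    return check
```

Under this reading, a substr condition with an empty value, as in (href, , substr), holds for every element on which the attribute is defined at all, which is one plausible way to express "contains a link".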

Element Characterizations. A set of elements of a subtree of an HTML tree
is identified with a tree path (starting from the subtree root), where additionally
a set of attribute conditions is satisfied. Such a characterization is called
an element path definition. Equivalently, XPath expressions can be used instead
(with some extensions, such as the possibility to express that an attribute value
is a concept like “isCity”). To simplify presentation, however, we stick to our
introduced  notation.  A  set  of  substrings  can  be  identified  by  using  a  string  path 
definition,  which  can  either  be  a  regular  expression,  or  refer  to  a  concept,  or 
even  combine  both.  Consider  the  example  of  Figure  2,  in  which  the  rule  defining 
<currency>  refers  to  a  variable  whose  instances  are  currencies. 

Tree  Extraction  Definition  Predicates.  These  predicates  specify  that  a  vari¬ 
able  should  be  instantiated  with  a  node  in  the  HTML  tree  which  matches  an 
element path definition. See, for example, the subelem atom of the fourth rule in
Figure 2, where the variable X is instantiated to all those text fields that occur
within <record> and contain a link. The variable S in this atom denotes the
super entity or, as we call it, the parent pattern, from which the current target
should be extracted via subelem. This parent pattern instance is constrained to
be an instance of <record> by the first atom of the rule. Note that the tree
path specified in a tree extraction definition predicate is always relative to the
parent pattern, i.e., its starting point is a node corresponding to the parent pattern
(in our example rule, an instance of <record>). Moreover, with subregion,
a sequence of elements can be extracted (e.g. used in tableseq in Figure 2).

String Extraction Definition Predicates. In the HTML parse tree, strings
are represented by the text of leaves of type content. However, we associate a
string C_n to every node n of the parse tree by simply concatenating (in left-to-right
order) all strings corresponding to leaves of the subtree rooted in n.
The string C_n associated to node n is available in the Lixto system as the
value  of  an  additional  attribute  elementtext  of  any  given  node  n.  Several  special 
conditions  that  express  restrictions  on  such  elementtexts  can  be  expressed  in 
Elog.  Elog  predicates  expressing  such  special  string  conditions  are  referred  to 
as  string  extraction  definition  predicates.  As  an  example,  consider  the  final  two 
rules  of  the  program  of  Figure  2.  The  last  rule  uses  a  regular  expression  as  string 
path  definition,  the  other  one  a  variable  reference  to  a  concept  atom  (explained 
below). Moreover, Attribute Extraction Predicates such as subatt (see examples
in Section 6) allow the contents of attribute values to be extracted.
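The construction of the elementtext string of a node, as described above, amounts to a left-to-right concatenation over the leaves of its subtree. A small Python sketch, assuming parse-tree nodes are dictionaries with a "type" key, a "text" key on content leaves, and a "children" list elsewhere (a representation chosen for this illustration only):

```python
def elementtext(node):
    """Concatenate, in left-to-right order, the text of all content leaves
    in the subtree rooted at node."""
    if node["type"] == "content":
        return node["text"]
    return "".join(elementtext(child) for child in node.get("children", []))
```

String conditions can then be checked against this value for any node, not just for leaves.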

Contextual  Conditions.  Contextual  conditions  specify  that  some  other  ele¬ 
ments  must  or  must  not  appear  either  before  or  after  some  instance.  These  con¬ 
textual  elements  are  not  limited  to  text  elements.  For  example,  on  a  page  with 
several  tables,  the  final  table  could  be  identified  by  an  external  condition  stating 
that  no  table  appears  after  the  desired  table.  The  rule  defining  a  <tableseq> 
uses  both  an  after  and  a  before  condition  to  express  that  one  is  interested  in 
exactly  the  region  between  some  specified  elements.  The  definition  of  <date> 
uses  a  notafter  condition  to  express  that  the  column  which  contains  the  date  is 
not  followed  by  another  column. 

Internal  Conditions.  Such  conditions  require  that  some  characteristic  feature 
must or must not appear within an instance. Imagine one wants to extract all
tables containing a word typeset in italics. This could be obtained by adding
an internal condition called contains to the body of the rule that defines the
pattern <record>. This condition expresses that in the subtree rooted at the
node representing the desired table row, a node must exist whose font-style
attribute  is  defined  and  has  the  value  italics. 

Concept  Conditions.  These  predicates  define  concepts  of  some  built-in 
top-level  ontology.  For  example,  one  may  enrich  the  system  with  predicates 
isEmail(X), isCountry(X), or isCurrency(X) (see Figure 2), stating that a
string X represents an email address, a country, or a currency, respectively.
These values of the variable X are created as output of concept attribute conditions
or string path definitions (using \var[X]). Concept predicates are not required to be
unary, e.g. isDate(X, Y) is a binary predicate with output Y in standard date
format. 
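How a variable in a string path definition interacts with a concept predicate can be sketched as follows. The Python code below loosely imitates the pattern \var[Y].* together with isCurrency(Y) from the price rule of Figure 2 by trying each prefix of the text as a binding for Y. The currency list, the prefix-based binding strategy, and both function names are assumptions of this sketch; the actual Lixto ontology and matching procedure are not specified here.

```python
# Hypothetical stand-in for the built-in currency concept.
CURRENCIES = {"$", "EUR", "GBP"}


def is_currency(s):
    """Toy concept predicate: true iff s denotes a known currency."""
    return s in CURRENCIES


def bind_currency_prefix(text):
    """Return the shortest prefix Y of text with is_currency(Y), else None.

    This mimics a string path of the form Y followed by anything,
    where the binding of Y must satisfy the concept condition.
    """
    for end in range(1, len(text) + 1):
        candidate = text[:end]
        if is_currency(candidate):
            return candidate
    return None
```

A table cell whose text yields no binding (for example a bid count) does not satisfy the condition and is therefore not extracted as a price.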

Comparison  Conditions.  These  are  predefined  relations  for  predefined  onto¬ 
logical  classes  of  elements.  Using  these  conditions,  one  can  e.g.  compare  two  dates 
(binary predicate), or require that an email address exists (unary predicate).

Pattern  References.  Each  standard  filter  contains  a  reference  to  its  parent 
pattern  which  defines  the  context  of  a  rule.  For  example,  see  the  rule  defining 
<itemdes>  in  Figure  2.  It  refers  to  <record>  as  parent.  The  substitution  for  S 
is  the  actual  tree  region  which  acts  as  parent  instance.  Moreover,  additional 
pattern  references  can  be  used,  for  instance  to  express  that  an  instance  of  some 
pattern always occurs after an instance of another pattern. Such additional pattern
references open the way for reference recursion (see Section 7 for details).

Range  Conditions.  A  range  condition  further  restricts  the  set  of  pattern  in¬ 
stances  extracted  by  a  filter  by  selecting  only  a  subset  of  the  pattern  instances 
which  satisfy  the  conditions  in  the  body  of  the  filter.  Indeed  the  pattern  in¬ 
stances  extracted  from  a  certain  parent  pattern  instance  are  ordered  according 
to  their  position  in  the  document,  and  a  range  condition  selects  only  those  pat¬ 
tern  instances  that  belong  to  the  required  range  of  solutions.  To  any  rule  a  range 
condition  such  as  “[3,7]”  can  be  added,  indicating  that  the  solution  only  includes 
the  third  up  to  the  seventh  matched  target.  Counting  can  occur  starting  with 
the  first  or  with  the  last  instance. 
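A range condition is essentially a 1-based, inclusive slice over the instances found within one parent instance, taken in document order. The following Python sketch illustrates this; the function name apply_range and the from_end flag for counting from the last instance are assumptions, since the concrete Elog notation for counting from the end is not given in this section.

```python
def apply_range(instances, a, b, from_end=False):
    """Keep the a-th up to the b-th instance (1-based, inclusive).

    instances: matched targets of one parent instance, in document order.
    from_end:  if True, positions are counted starting with the last instance.
    The result is always returned in document order.
    """
    ordered = list(instances)
    if from_end:
        ordered.reverse()
    selected = ordered[a - 1:b]
    if from_end:
        selected.reverse()
    return selected
```

With ten matched targets, the range [3,7] keeps targets three to seven; counted from the end, the same range keeps targets four to eight.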

Using  the  above  predicates,  a  standard  extraction  rule  looks  as  follows: 

New(S, X) ← Par(_, S), Ex(S, X), Co(S, X, ...)[a, b]

where S is the parent instance variable, X is the pattern instance variable,
Ex(S, X) is an extraction definition predicate, and the optional Co(S, X, ...) are
further imposed conditions. A tree (string) extraction rule uses a tree (string)
extraction definition atom and possibly some tree (string) conditions and general
conditions. The numbers a and b are optional and serve as range parameters. Par
and New are pattern predicates referring to the parent pattern and defining the
new pattern, respectively. This standard rule reflects the principle of aggregation.
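To make the shape of a standard rule more tangible, here is a hedged Python sketch of how such a rule could be evaluated over already derived parent facts. It is not the Lixto evaluator: evaluate_rule and its parameters are hypothetical, the extraction predicate is modelled as a plain function, and minimization is left out.

```python
def evaluate_rule(parent_facts, extract, conditions=(), rng=None):
    """Evaluate a rule of the standard form: parent atom, extraction atom,
    optional conditions, optional range.

    parent_facts: iterable of (grandparent, s) pairs, the facts of Par
    extract:      function s -> list of targets x in document order (Ex)
    conditions:   functions (s, x) -> bool that must all hold (Co)
    rng:          optional (a, b), 1-based and inclusive, applied per parent
    Returns the derived facts of New as a list of (s, x) pairs.
    """
    derived = []
    for _, s in parent_facts:
        targets = [x for x in extract(s) if all(c(s, x) for c in conditions)]
        if rng is not None:
            a, b = rng
            targets = targets[a - 1:b]
        derived.extend((s, x) for x in targets)
    return derived
```

In the record rule of Figure 2, for instance, parent_facts would be the tableseq facts and extract would return the tables contained in a given sequence.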

The semantics of a rule is given as the set of matched targets x: A substitution
s, x for S and X evaluates New(s, x) to true iff all atoms of the body are
true  for  this  substitution.  Only  those  targets  are  extracted  for  which  the  head 
of  the  rule  resolves  to  true.  Moreover,  if  the  extraction  definition  predicate  is  a 
subsequence  predicate,  only  minimal  instances  are  matched  (i.e.  instances  that 
do  not  contain  any  other  instances).  This  is  a  nonmonotonic  concept  discussed 
in Section 7. Observe that range criteria are applied after non-minimal targets
have been sorted out. Note that range conditions are well-defined only in the
case of no reference recursion (cf. Section 7).
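The order of the two steps, minimization first and range selection second, matters. The Python sketch below illustrates it on candidate sequences modelled as (start, end) position intervals; this interval model and the helper names are assumptions made for the illustration, not the data structures used by Lixto.

```python
def minimal_instances(intervals):
    """Keep only candidates that do not contain any other candidate.

    intervals: list of (start, end) pairs in document order.
    """
    def contains(outer, inner):
        return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]

    return [i for i in intervals if not any(contains(i, j) for j in intervals)]


def minimal_then_range(intervals, a, b):
    """Apply minimization first, then the 1-based inclusive range [a, b]."""
    return minimal_instances(intervals)[a - 1:b]
```

Because the range is applied to the minimized list, the first instance in the range sense is the first minimal instance, not the first candidate found.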

A  pattern  definition  (for  short,  pattern)  is  a  set  of  extraction  rules  defining  the 
same  head.  We  distinguish  document,  tree  and  string  patterns.  To  tree  patterns, 
only  tree  extraction  rules  can  be  asserted,  and  to  string  patterns  only  string 
extraction  rules.  The  third  kind  of  patterns,  document  patterns,  are  discussed 
in  the  next  section.  A  pattern  acts  like  a  disjunction  of  rule  bodies:  To  be  an 
extracted  instance  of  a  pattern,  a  target  needs  to  be  in  the  solution  set  of  at 
least  one  rule.  The  set  of  matched  target  instances  of  a  pattern  additionally  obeys 
a  minimality  criterion  (see  Section  7).  In  patterns,  even  in  those  consisting  of 
a  single  rule,  overlapping  targets  may  occur.  Observe  that  we  do  not  pose  the 
requirement  that  each  rule  belonging  to  a  given  pattern  refers  to  the  same  parent 
pattern.  This,  together  with  the  capability  of  document  navigation,  allows  for 
recursion  over  patterns  as  explained  in  more  detail  in  Section  6. 

An  extraction  program  P  is  a  set  of  patterns.  Elog  program  evaluation  differs 
from Datalog evaluation in the following three aspects: the use of built-in
predicates, various kinds of minimization, and the use of range conditions. Moreover,
atoms  are  not  evaluated  over  an  extensional  database  of  facts  representing  a 
Web  page,  but  directly  over  the  parse  tree  of  the  Web  page. 

The application of a program to an HTML page creates a set of hierarchically
ordered tree regions and string sources (called a pattern instance base) by
applying all patterns of the program to the given HTML page and possibly to
further HTML pages (see the notion of document filters in Section 4). Each pattern
produces a set of instances. Each pattern instance contains a reference to its
parent instance. Observe that the pattern instance base always forms a forest,
regardless of the structure of the pattern graph. We consider the instances of
document filters as the root nodes of the trees of this forest. The pattern
instance base can be translated into XML as already described in Section 2.
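
The forest shape of the pattern instance base can be illustrated with a small sketch. It is not Lixto's implementation: the class PatternInstance, the helpers add_instance and to_xml, and the sample values are hypothetical names chosen for illustration. The only property taken from the text is that every instance stores a reference to exactly one parent instance, which is what makes the result a forest.

```python
class PatternInstance:
    """One extracted instance (hypothetical structure, for illustration only)."""

    def __init__(self, pattern, content, parent=None):
        self.pattern = pattern      # name of the pattern that produced it
        self.content = content      # extracted text (empty for inner nodes here)
        self.parent = parent        # exactly one parent instance, or None for a root
        self.children = []


def add_instance(pattern, content, parent=None):
    """Create an instance and register it below its single parent instance."""
    inst = PatternInstance(pattern, content, parent)
    if parent is not None:
        parent.children.append(inst)
    return inst


def to_xml(inst, indent=0):
    """Render one tree of the forest as nested XML-like text."""
    pad = "  " * indent
    if not inst.children:
        return f"{pad}<{inst.pattern}>{inst.content}</{inst.pattern}>"
    inner = "\n".join(to_xml(child, indent + 1) for child in inst.children)
    return f"{pad}<{inst.pattern}>\n{inner}\n{pad}</{inst.pattern}>"


# A document instance is the root; records and their fields hang below it.
doc = add_instance("document", "")
rec = add_instance("record", "", doc)
add_instance("price", "$ 10", rec)
add_instance("bids", "3", rec)
print(to_xml(doc))
```

Because a child is attached to one parent only, sharing a pattern between several parent patterns (as in the pattern graph) still yields separate subtrees in the instance base.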

4  A  Closer  Look  at  some  Lixto  Features 

In  this  section,  we  discuss  some  more  advanced  features  of  Lixto,  in  particular 
two further kinds of rules. A standard rule reflects the principle of aggregation;
however, designers of wrappers sometimes wish to express specialization. For
instance,  if  one  rule  extracts  a  set  of  tables,  it  might  be  desirable  to  create  a 
rule  which  restricts  the  extracted  tables  to  those  which  contain  some  particular 
feature.  A  specialization  rule  looks  as  follows: 

New(S, X) ← Old(S, X), Co(S, X, ...)[a, b]

In such a rule a pattern is specialized, i.e. some of the parent-pattern
instances are returned as pattern instances of the new pattern definition. It
contains neither a parent-pattern reference nor an extraction definition atom;
instead it only contains a pattern reference. Observe that, analogously to
specialization rules, generalization rules can be obtained by simply creating
multiple specialization rules for one pattern which refer to different patterns
and do not contain any conditions. Another kind of rule is the document rule,
using a getDocument(S, X) atom, where S is a string source representing a URL,
and X the Web page the URL points to. With such rules, one can crawl to further
documents.

New(S, X) ← Par(_, S), getDocument(S, X)

Each Elog program has an initial rule using the getDocument atom with
user-specified input. The initial document rule is the only rule without a
parent-pattern reference. Instead, it uses a variable “$1” (or a fixed URL) which is
instantiated to a string source representing a URL during run time (the start
document). Document filters can be applied to document patterns only. Parents
of tree patterns are either tree or document patterns, parents of string patterns
are tree or string patterns, and parents of document patterns are string patterns.

Figure 4 illustrates the use of document rules together with specialization
rules. This example moreover illustrates the use of disjunctive pattern
definitions pointing to two different parents, which in this case stem from
two different kinds of documents. Consider the root pattern <document> and
its child patterns <ebaydocument> and <yahoodocument>. Both are
specializations requiring that the document is an eBay page (a category search
result on www.ebay.com such as
http://listings.ebay.com/aw/plistings/list/all/category3707/index.html), or a
yahoo auctions page (i.e., a search result of auctions.yahoo.com), respectively.
Observe that the patterns <ebaydocument> and <yahoodocument> are not document
patterns, but tree patterns, since they refer to instances of tree regions. The
predicate contains is an internal condition, expressing that there is an element
in X which satisfies the given element path definition.

document(S, X) ← getDocument(S = $1, X)
ebaydocument(S, X) ← document(S, X),
    contains(X, (*.body, [(elementtext, eBay, substr)]), _)
yahoodocument(S, X) ← document(S, X),
    contains(X, (*.body, [(elementtext, Yahoo, substr)]), _)

5  Disjunctive  Pattern  Construction 

There are several cases where it is necessary to define more than one filter for
the same pattern to express how to extract desired pieces of information from a
Web page. In this section we show some real-world examples where it is useful to
define a pattern using a disjunction of filters. Moreover, we show that it is
generally possible for different filters of the same pattern to extract information
from different parent patterns. Let us first consider an example where a wrapper
designer wants to define a pattern consisting of filters that describe extraction
targets for different page types. Assume a wrapper extracts prices from two kinds
of Web pages displaying books and their prices, where pages of the first kind
are  US  pages  and  pages  of  the  second  kind  are  UK  pages.  The  characteristic 
features  of  prices  are  a  dollar  sign  on  US  Web  pages  and  a  pound  sterling  sign 
on  UK  pages.  Assume,  furthermore,  the  current  sample  page  is  a  US  page.  A 
pattern  named  <price>  should  thus  be  defined  via  two  filters:  the  first  taking 
care  of  US  pages  and  the  second  of  UK  pages.  After  having  visually  created  an 
appropriate  filter  for  prices  in  USD  on  an  already  loaded  US  sample  page,  the 
designer  switches  to  a  UK  sample  page  and  visually  defines  the  second  filter  for 
the  <price>  pattern  on  that  page.  The  wrapper  then  works  on  both  types  of 
pages. 

In Lixto it is not only possible to create a pattern consisting of several filters,
but also to let the filters of a particular pattern definition refer to different
parent patterns. Again, consider the example in Figure 4. For both the <ebaydocument>
and the <yahoodocument> pattern we now have to extract the list of available
items (records). Since records are structured differently in eBay and yahoo
auctions, it is necessary to create for each kind of page a record pattern of its own,
i.e.  <ebayrecord>  and  <yahoorecord>.  Once  we  have  defined  the  patterns  for 
the  records,  the  patterns  <itemdes>,  <price>,  <bids>  and  <date>  can  be  easily 
defined  with  one  filter  for  each  kind  of  record.  Although  this  wrapper  works  fine 
for  both  yahoo  and  eBay  auctions,  it  still  only  returns  results  from  one  summary 
page  as  it  does  not  follow  the  “next”  link,  and  also  is  not  capable  of  extracting 
detail  information.  Moreover,  using  the  pattern  <itemdes>  as  parent,  a  string 
pattern  URL  is  defined  using  an  attribute  filter.  This  attribute  filter  extracts  the 
value  of  the  link  to  detailed  information  of  the  particular  item.  This  attribute 
filter  works  for  both  sites,  since  both  store  the  URL  pointing  to  the  detail  page 
in  the  corresponding  href  attribute. 

URL(S, X) ← itemdes(_, S), subatt(S, href, X)

An attribute filter uses the extraction definition predicate subatt to extract
an attribute value of instances of S and instantiates a string source X with it.
The following additional features are currently implemented and can be added
via Lixto's XML Tool:

1. The patterns <ebayrecord> and <yahoorecord> can both be mapped to the
XML element <record>, and an attribute source of <record> can be defined,
which takes the constant value eBay or yahoo, respectively.

2.  In  case  the  string  source  of  <URL>  is  a  relative  URL,  a  prefix  variable  (BASE) 
can  be  added  to  it,  which  has  the  value  of  the  base  URL  of  the  document 
from  which  the  information  is  extracted.  This  variable  can  also  be  used  for 
following  relative  links  when  crawling  to  further  pages  (see  next  section). 

3. Auxiliary patterns such as <ebaydocument> and <yahoodocument> can be
excluded from the XML mapping, and a DTD can be created by additionally
assigning a multiplicity to each data type (Figure 4).

Fig. 5. eBay item description page


6  Web  Crawling  and  Recursive  Wrapping 

6.1  Following  Links 

For each item, eBay pages contain a reference to a page containing detailed
information about the item itself. In the previous section, we have shown how to
extract the URL pointing to detail pages, but we did not use it further. In this
section we extend the wrapper program to also extract the detailed description
of each item. This is an instance of a general class of applications, where a
wrapper needs to collect and group together elements from several pages. The
wrapper designer thus needs to “teach” the system on the basis of sample pages
how to follow URLs and collect the elements from the different pages. On eBay,
each item is described by a line stating summary information for the given
auction item. Each such line contains a link to a Web page with more detailed
information on the respective item, such as the seller name and the shipping
information (Figure 5).

The designer adds a child document pattern <detaildocument> to the string
pattern  <URL>  which  resulted  from  extracting  the  value  of  the  href  attribute 
of  <itemdes>.  For  this,  the  designer  proceeds  by  following  one  example  detail 
document,  loading  the  corresponding  page,  and  defining  the  remaining  relevant 
patterns  (such  as  “sellername”  and  “shippinginfo”)  as  child  patterns  of  this  new 
document pattern. Figure 6 illustrates an expanded version of the Elog program of
Figure 3, which defines an attribute filter extracting a URL (as in Figure 4), and a further
document  pattern  consisting  of  one  filter  to  extract  detailed  information  for  each 
item.  The  auxiliary  patterns  <URL>  and  <detaildocument>  are  not  mapped  to 
XML  via  the  XML  translation  scheme.  The  navigation  to  a  detail  document 
looks  as  follows: 


URL(S, X) ← itemdes(_, S), subatt(S, href, X)
detaildocument(S, X) ← URL(_, S), getDocument(S, X)


Fig. 6. Following Links

6.2 Recursive Wrapping

As  we  have  already  pointed  out,  each  filter  of  a  given  pattern  may  refer  to 
a  different  parent  pattern.  Here,  we  show  how  to  apply  this  feature  to  reuse 
patterns.  This  paves  the  way  for  creating  recursive  programs.  We  call  this  kind 
of  recursion  pattern  recursion.  Another  kind  of  recursion,  reference  recursion, 
based  on  pattern  references  is  discussed  in  Section  7. 

Let  us  first  consider  the  example  program  below. 

document(S, X) ← getDocument($1, X)
table(S, X) ← document(_, S), subelem(S, .*.table, X)
table(S, X) ← table(_, S), subelem(S, .*.table, X)

It  extracts  all  nested  tables  within  one  page,  starting  with  the  outermost,  and 
stores  them  in  this  hierarchical  order  in  the  pattern  instance  base.  The  second 
rule  of  <table>  is  iteratively  called,  until  no  further  table  can  be  extracted. 
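
The iteration performed by the two table rules can be sketched as follows. This is only an illustration under stated assumptions: the parse tree is replaced by hypothetical (tag, children) tuples, and the helper outermost_tables assumes that subelem with the path .*.table yields the tables below S that are not nested inside another table below S, as the phrase "starting with the outermost" suggests. The function names are not part of Elog.

```python
# A node is a (tag, children) pair; a hypothetical stand-in for the HTML parse tree.
def outermost_tables(node):
    """Tables strictly below `node` that are not nested in another table below it."""
    found = []
    for child in node[1]:
        if child[0] == "table":
            found.append(child)
        else:
            found.extend(outermost_tables(child))
    return found


def extract_tables(parent, depth=0, out=None):
    """Apply the table rule to `parent`, then re-apply it to every extracted table."""
    if out is None:
        out = []
    for table in outermost_tables(parent):
        out.append((depth, table))              # depth mirrors the instance hierarchy
        extract_tables(table, depth + 1, out)   # second rule: a table as parent
    return out


page = ("html", [("body", [
    ("table", [("tr", [("td", [("table", [])])])]),   # a table with one nested table
    ("table", []),                                    # a flat table
])])
levels = [depth for depth, _ in extract_tables(page)]
print(levels)
```

The recursion stops on its own because each call descends strictly deeper into a finite tree, which corresponds to the rule being called "until no further table can be extracted".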

Another  possible  use  of  recursively  defined  wrappers  is  the  following  real- 
world  application.  Usually  a  wrapper  designer  does  not  want  to  extract  data 
from  a  single  eBay  page  on  notebooks,  but  from  all  pages  which  are  connected 
to  each  other  via  a  “next  page”  link.  We  illustrate  how  the  eBay  program  of 
Figure  6  can  be  extended  to  follow  the  next  link  and  can  reuse  the  already 
created  pattern  structure.  Thus,  the  pattern  <ebaydocument>  is  a  document 
pattern  consisting  of  two  filters  with  different  parents.  The  first  one  refers  to 
the  specified  start  document,  whereas  the  second  one  follows  the  “next”  link  on 
each  page.  This  part  of  the  program  looks  as  follows: 

next(S, X) ← ebaydocument(_, S),
    subelem(S, (*.content, [(href, , substr),
        (elementtext, (next page), exact)]), X)
nexturl(S, X) ← next(_, S), subatt(S, href, X)
ebaydocument(S, X) ← getDocument(S = $1, X)
ebaydocument(S, X) ← nexturl(_, S), getDocument(S, X)

Recall that “$1” is interpreted as a constant whose value is the URL of
the start document of a Lixto session. This initial filter was already present in
the previous example, and is the starting point of evaluation. The second filter
refers to a different parent pattern, which is <nexturl>. Instances of the pattern
<nexturl> are string sources which represent a URL. The pattern <nexturl>
is created via an attribute filter which extracts via subatt the value of “href”
present in the element which contains the text “next page”.
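
The control flow induced by the two ebaydocument filters can be sketched in a few lines. This is an illustrative sketch, not the Lixto evaluator: the dictionary SITE, the page names, and the functions get_document and crawl are hypothetical, pages are looked up in memory rather than fetched, and the guard against revisiting a URL is an added assumption that the text does not mention.

```python
# Hypothetical in-memory "Web": URL -> (records on the page, URL of the next page or None).
SITE = {
    "page1": (["item A", "item B"], "page2"),
    "page2": (["item C"], "page3"),
    "page3": (["item D"], None),
}


def get_document(url):
    """Stand-in for getDocument: look the page up instead of fetching it."""
    return SITE[url]


def crawl(start_url):
    """Collect records from the start page and from every page reached via 'next'."""
    records, seen, url = [], set(), start_url
    while url is not None and url not in seen:   # assumed guard against link cycles
        seen.add(url)
        page_records, next_url = get_document(url)   # first filter, then second filter
        records.extend(page_records)
        url = next_url                               # plays the role of <nexturl>
    return records


print(crawl("page1"))
```

The first loop iteration corresponds to the filter with the start URL, every later iteration to the filter whose parent is <nexturl>.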

In the second rule defining the pattern <ebaydocument>, the variable S is
instantiated with string sources which represent URLs. For each “next” link,
a new instance of <ebaydocument> is created, pointing to the next page. This
new page serves as parent pattern for <tableseq> and <next>. The pattern
structure is hence re-used for this new page. In this example, two different
document patterns are used, on the one hand <ebaydocument>, on the other hand
<detaildocument>. Instances of the pattern <ebaydocument> are the summary
pages, whereas instances of <detaildocument> are the detail information pages
for each item. In an XML translation scheme, the wrapper designer moreover
wants to state how the documents are arranged inside the XML document.
Although further instances of <ebaydocument> are hierarchically embedded in the
previous one, the wrapper designer may maintain all <record> instances on the
same level.

In  the  visual  interface  of  Lixto,  a  document  pattern  can  be  generated  without 
the  need  to  manually  define  auxiliary  patterns.  Instead  visual  guidance  is  offered 
for  creating  a  single  rule  which  uses  a  sequence  of  extraction  definition  predicates. 
For  this  example  program,  this  single  rule  can  be  represented  as  follows: 

ebaydocument(S, X) ← ebaydocument(_, S),
    subelem(S, (*.content, [(href, , substr),
        (elementtext, (next page), exact)]), Y),
    subatt(Y, href, Z), getDocument(Z, X)


[Figure: pattern graph of the recursive wrapper, with nodes including ebaydocument, nexturl, date, itemdes, price, bids, detaildocument, URL, currency, pricewc, sellername and shippinginfo]


Fig.  7.  Recursive  Extraction 


7  Nonmonotonic  Issues 

Minimization of pattern instances. The set of matched targets of an Elog pattern
is minimized in such a way that pattern instances which contain other instances of
the same pattern w.r.t. the same parent-pattern instance are ignored. Pattern
minimization applies both to tree and string rules. If a pattern consists of a
single filter, the minimized set of its matched targets equals the initial set except
if the extraction definition predicate of the filter is subregion (which extracts a
sequence of elements).

Consider the following simple example. Assume the major headlines of a
particular newspaper Web page are a table consisting of various table data (the
wrapper designer is interested in all the contents), and the minor headlines of the
same newspaper, which appear on the same page, are table columns of another
table. The minor headlines are moreover characterized by a red font, and the
major headlines contain a link (href) somewhere. However, the table containing
the data of all minor headlines also contains links (i.e. the href attribute is
a characteristic attribute occurring in these two tables only). A program for
extracting all headlines can be written in the following way, where par is the
parent pattern identifying the relevant area of the newspaper page.

headline(S, X) ← par(_, S), subelem(S, .*.table, X),
    contains(S, (.content, [(href, , substr)]), X)
headline(S, X) ← par(_, S), subelem(S, .*.td, X),
    contains(S, (.content, [(font-color, red, exact)]), X)

Hence,  the  first  rule  also  matches  the  table  which  contains  all  minor  headlines. 
However,  since  in  this  table,  other  pattern  instances  are  matched,  too,  only  the 
minimal  instances  are  returned,  which  are  in  this  case  the  table  columns.  For 
the  major  headlines,  however,  the  table  is  extracted.  Another  example  is  the 
minimization  of  the  set  of  instances  generated  by  a  single  rule: 

tableseq(S, X) ← par(_, S), subregion(S, .*.body.*.center, .table, .table, X)

Such  a  rule  (with  additional  conditions)  is  used  in  the  eBay  program  of 
Figure  2.  However,  with  no  additional  condition,  the  semantics  is  to  extract 
all  possible  sequences  of  tables  and  to  minimize  the  result;  since  the  minimal 
sequences  of  tables  are  sequences  of  a  single  table,  this  rule  returns  such  instances 
only.  To  enforce  a  particular  longer  sequence  of  tables,  such  as  the  sequence  of 
tables  containing  the  relevant  data  of  sold  items,  some  before  and  after  conditions 
need  to  be  added.  In  the  case  of  eBay,  immediately  before  and  after  the  target 
instance  a  particular  text  or  image  shall  occur,  respectively.  This  returns  a  single 
pattern  instance,  the  sequence  of  desired  record  tables. 

Pattern minimization can be expressed in Elog extended with stratified
negation and a suitable built-in predicate contained_in(X, Y) expressing offset-wise
containment of X in Y. In particular, a set of filters of p(S, X) defining the
pattern p is rewritten in the following way. Consider the initial pattern definition:

p(S, X) ← par₁(_, S), Ex₁(S, X), Co₁(S, X, ...)
p(S, X) ← ...
p(S, X) ← parₙ(_, S), Exₙ(S, X), Coₙ(S, X, ...)

The  pattern  name  is  renamed  to  p'  and  additional  rules  are  added: 

p'(S, X) ← par₁(_, S), Ex₁(S, X), Co₁(S, X, ...)
p'(S, X) ← ...
p'(S, X) ← parₙ(_, S), Exₙ(S, X), Coₙ(S, X, ...)
p''(S, X) ← p'(S, X), p'(S, X₁), contained_in(X₁, X)
p(S, X) ← p'(S, X), not p''(S, X)

The rule defining p'' requires that the instances of X and X₁ both stem from the
same parent pattern instance (otherwise, if they stem from different parent-pattern
instances, minimization is usually undesired). In the rewriting, p' is the pattern
predicate initially being built by different filters. Each instance p'(s, x) which is
non-minimal, i.e. for which p''(s, x) holds because x contains a smaller valid
instance, is not derived as p(s, x). Only minimal instances are derived.
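
The effect of this rewriting can be sketched with plain sets. The sketch makes two assumptions that the text does not spell out: instances are modelled as hypothetical (start, end) character-offset intervals, and contained_in is taken to be proper containment, since a reflexive reading would mark every instance as non-minimal. The sample offsets loosely follow the headline example and are invented.

```python
def contained_in(inner, outer):
    """Offset-wise containment of `inner` in a different interval `outer` (assumed proper)."""
    return inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]


def minimize(candidates):
    """candidates: set of (parent_id, (start, end)) pairs, playing the role of p'.

    Returns the pairs playing the role of p: candidates that contain no other
    candidate belonging to the same parent instance.
    """
    non_minimal = {                       # plays the role of p''
        (s, x)
        for (s, x) in candidates
        for (s1, x1) in candidates
        if s == s1 and contained_in(x1, x)
    }
    return candidates - non_minimal       # p(S, X) <- p'(S, X), not p''(S, X)


p_prime = {
    ("par1", (0, 100)),    # table holding all minor headlines
    ("par1", (10, 40)),    # a cell inside that table
    ("par1", (50, 90)),    # another cell inside that table
    ("par1", (200, 300)),  # table of major headlines, contains no other match
    ("par2", (20, 30)),    # other parent instance: never compared with par1
}
print(sorted(minimize(p_prime)))
```

Only the enclosing interval (0, 100) is dropped; the instance under par2 survives even though its offsets fall inside it, because containment is checked per parent instance.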

Ranges. The semantics of range criteria [a, b] of a filter rule NewPat(S, X) ←
filterbody[a, b] can also be expressed by a suitable rewriting of the rule. A range
condition assumes that an order relation is defined among pattern instances
extracted from the same parent pattern instance; thus in the rewriting we assume
the presence of a predicate greater(S, X, Y) which evaluates to true if X and Y
are instances derived from S and X precedes Y (using character offsets for
comparison). The first step of rewriting consists of adding a new predicate NewPat'
that is defined by a unique filter NewPat'(S, X) ← filterbody. Then, two
predicates FirstSol and succ are defined. FirstSol selects from the instances in
NewPat' the first instance, and succ defines a successor relation among
instances in NewPat' (due to the lack of space we omit the formal definition).
The complete rewriting is as follows:

NewPat(S, X) ← NewPat'(S, X), Solposition(S, X, P), a ≤ P ≤ b
Solposition(S, X, 1) ← NewPat'(S, X), FirstSol(S, X)
Solposition(S, X, P) ← Solposition(S, X', P'), NewPat'(S, X), succ(S, X', X), P = P' + 1.

In  both  predicates  FirstSol  and  succ,  the  predicate  NewPat'  appears 
negated; hence, the predicate NewPat depends on the negation of all the
predicates appearing in filterbody.
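
Operationally, the rewriting numbers the instances of each parent in document order and keeps those whose position falls into the range. A minimal sketch of that reading follows; the function apply_range and the sample data are hypothetical, instances are reduced to their start offsets, positions are assumed to be 1-based with inclusive bounds, and the input is assumed to be already minimized.

```python
def apply_range(instances, a, b):
    """instances: list of (parent_id, start_offset) pairs of one pattern.

    Keeps, for every parent instance, the instances at positions a..b
    (1-based, inclusive) in offset order.
    """
    by_parent = {}
    for parent, start in instances:
        by_parent.setdefault(parent, []).append(start)
    kept = []
    for parent, starts in by_parent.items():
        # enumerate over the sorted offsets plays the role of Solposition
        for position, start in enumerate(sorted(starts), start=1):
            if a <= position <= b:
                kept.append((parent, start))
    return kept


found = [("s1", 40), ("s1", 5), ("s1", 22), ("s2", 7), ("s2", 3)]
print(apply_range(found, 2, 3))
```

Positions are counted separately for each parent instance, which matches the requirement that the order relation is defined among instances of the same parent pattern instance.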

Pattern Reference Recursion and Ranges. Using ranges together with pattern
references might introduce unstratified negation. Using pattern references can
introduce reference recursion. Still, without ranges, a unique model is returned.
However, additionally allowing range conditions to occur in such recursive rules
requires the use of a semantics akin to the stable model semantics (returning
multiple models) or the well-founded semantics (returning a minimal model), as this
introduces unstratified negation into the program (considering the above rewriting).
For the following example (possibly containing additional filters for p and q), a
nonmonotonic semantics is required.

p(S, X) ← par(_, S), subelem(S, epd, X), before(S, X, ..., Y), q(S, Y)[a, b]
q(S, X) ← par(_, S), subelem(S, epd, X), before(S, X, ..., Y), p(S, Y)[c, d]

Observe  that  a  program  which  uses  range  and  pattern  recursion,  but  no 
reference  recursion,  is  always  locally  stratified,  i.e.  its  ground  instantiation  is 
stratified. For implementation reasons, we limit pattern references in such a way
that the program remains locally stratified. This is a subset of programs whose
rewriting contains only stratified negation.

8 Current/Future Work

Further work includes considering various extensions of Elog, such as using
stratified negation instead of special negative predicates like notbefore, extending
the handling of pattern references together with recursion as discussed above,
studying further kinds of conditions such as universally quantified ones (that
require all elements to have a particular feature), and complement extraction
(e.g. to remove advertisements from Web pages). An editor for Elog rules will be
offered for more experienced wrapper designers who nevertheless lack
programming skills. This editor describes Elog patterns using a colloquial pattern
description language. A concept editor for adding syntactic and semantic
concepts to the list of built-in predicates is currently under construction. Moreover,
the Lixto prototype is currently being re-designed as a servlet version allowing
pattern generation in the user’s favorite browser. Finally, an Elog2XSLT
conversion tool is going to be developed which will transform a subset of possible
Elog programs into XSLT.

Abstract. Every logical formalism gives rise to two fundamental
algorithmic problems: model checking and inference. In propositional logic,
the model checking problem is polynomial-time solvable, while the
inference problem is coNP-complete. In propositional circumscription,
however, these problems have higher computational complexity, namely the
model checking problem is coNP-complete, while the inference problem
is $\Pi_2^P$-complete. In this paper, we survey recent results on the
computational complexity of restricted cases of these problems in the context of
Schaefer’s framework of generalized satisfiability problems. These results
establish dichotomies in the complexity of the model checking problem
and the inference problem for propositional circumscription. Specifically,
in each restricted case the model checking problem for propositional
circumscription either is coNP-complete or is polynomial-time solvable.
Furthermore, in each restricted case the inference problem for
propositional circumscription either is $\Pi_2^P$-complete or is in coNP. These
dichotomy theorems yield a complete classification of the “hard” and the
“easier” cases of the model checking problem and the inference
problem for propositional circumscription. Moreover, they provide efficiently
checkable criteria that tell apart the “hard” cases from the “easier” ones.


1  Introduction 

Circumscription, introduced by McCarthy [McC80], is one of the most
well-developed and extensively studied formalisms of nonmonotonic reasoning. In
circumscription, formulas of a logic are used to specify properties of objects, models
of formulas are ordered according to a suitable partial order, and preference is
given to models that are minimal with respect to this partial order. The key
intuition behind the focus on minimal models is that they are the ones that embody
common sense, because they have as few “exceptions” as possible. Consequently,
circumscription can be thought of as an application of Ockham’s razor principle
(principle of parsimony) to the formalization of common-sense reasoning.

* Part of this research was carried out while on sabbatical at the University of
California, Santa Cruz. Research partially supported by the Research Committee of
the University of Patras and by the Computer Technology Institute.

Propositional circumscription is the basic case of circumscription in which
satisfying truth assignments of propositional formulas are partially ordered
according to the coordinatewise partial order $\leq$ on Boolean vectors, which extends
the order $0 \leq 1$ on $\{0, 1\}$. Specifically, if $\alpha = (a_1, \ldots, a_n)$ and
$\beta = (b_1, \ldots, b_n)$ are two truth assignments, then $\alpha \leq \beta$ holds if
$a_i \leq b_i$ for every $i$ such that $1 \leq i \leq n$. A minimal model of a propositional
formula $\varphi$ is a truth assignment $\alpha$ such that the following two conditions
hold: (1) $\alpha(\varphi) = 1$; (2) if $\beta$ is a truth assignment such that
$\beta(\varphi) = 1$ and $\beta \leq \alpha$, then $\alpha = \beta$. For example, the minimal
models  of  the  formula  (x  V  2/)  A  (-ix  V  ?/)  A  (a:  V  ->?/)  are  (0, 1)  and  (1, 0). 

Every logical formalism gives rise to two fundamental decision problems:
model checking and inference. Intuitively, the former is the problem of deciding
whether a “structure” satisfies a “formula”, whereas the latter is the problem
of deciding whether a “formula” can be inferred from another “formula” in the
context of the formalism under consideration. In the case of propositional
circumscription, these two problems take the following precise form.

Definition 1: The model checking problem for propositional circumscription
asks: given a propositional formula $\varphi$ and a truth assignment $\alpha$, is $\alpha$
a minimal model of $\varphi$?

The inference problem for propositional circumscription asks: given two
propositional formulas $\varphi$ and $\psi$, is $\psi$ true in every minimal model of $\varphi$?

We write $\varphi \models_{\mathrm{CIRC}} \psi$ to denote that $\psi$ is true in every minimal
model of $\varphi$.

□
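
Both problems of Definition 1 can be made concrete by a brute-force sketch that enumerates all truth assignments. It is exponential in the number of variables and only meant to illustrate the definitions; the function names and the sample formula x ∨ y are choices made here, not taken from the paper. Formulas are represented as Python predicates over 0/1 tuples.

```python
from itertools import product


def models(formula, n):
    """All satisfying assignments of `formula` over n variables, as 0/1 tuples."""
    return [a for a in product((0, 1), repeat=n) if formula(a)]


def leq(a, b):
    """Coordinatewise partial order on Boolean vectors."""
    return all(x <= y for x, y in zip(a, b))


def is_minimal_model(formula, n, a):
    """Model checking: `a` satisfies the formula and no other model lies below it."""
    return bool(formula(a)) and all(
        b == a or not leq(b, a) for b in models(formula, n)
    )


def circ_entails(phi, psi, n):
    """Inference: psi is true in every minimal model of phi."""
    return all(psi(a) for a in models(phi, n) if is_minimal_model(phi, n, a))


def phi(a):
    return a[0] or a[1]             # the formula x OR y


def psi(a):
    return not (a[0] and a[1])      # the formula NOT (x AND y)


minimal = [a for a in models(phi, 2) if is_minimal_model(phi, 2, a)]
classical = all(psi(a) for a in models(phi, 2))   # ordinary propositional inference
print(minimal, circ_entails(phi, psi, 2), classical)
```

For this formula the minimal models are (0, 1) and (1, 0), so ¬(x ∧ y) follows under circumscription although it does not follow in ordinary propositional logic, where the model (1, 1) must also be considered.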

It has been shown that the model checking problem for propositional
circumscription is coNP-complete (Cadoli [Cad92]), whereas the inference problem
for propositional circumscription is $\Pi_2^P$-complete¹ (Eiter and Gottlob [EG93]).
In fact, the model checking problem for propositional circumscription remains
coNP-complete even when restricted to 3CNF-formulas, while the inference
problem $\varphi \models_{\mathrm{CIRC}} \psi$ for propositional circumscription remains
$\Pi_2^P$-complete even when $\varphi$ is a 3CNF-formula and $\psi$ is a negative
literal $\neg u$. These results quantify
the increase in computational complexity that arises when making the transition
from ordinary propositional logic to propositional circumscription, since in the
case of ordinary propositional logic the model checking problem is solvable in
linear time and the inference problem is coNP-complete (Cook [Coo71]).
Moreover, these results raise the problem of identifying restricted cases in which the
model checking problem and the inference problem for propositional
circumscription have computational complexity lower than the general case. To this effect,
Cadoli [Cad92,Cad93] found several polynomial-time solvable cases of the model
checking problem for propositional circumscription; in a similar vein, Cadoli
and Lenzerini [CL94] studied restricted cases in which the inference problem for
propositional circumscription is polynomial-time solvable or is in coNP.

¹ The class $\Pi_2^P$ forms the second level of the polynomial hierarchy PH, and contains
both NP and coNP as subclasses (see [Pap94]).

In studying restricted cases of an algorithmic problem, ideally one would
like to have a rich conceptual framework that makes it possible to express a
variety of restricted cases and analyze their complexity. For Boolean
satisfiability, such a framework was introduced and investigated by Schaefer [Sch78], who
succeeded in obtaining a complete classification of the complexity of Boolean
satisfiability problems in this framework. Cadoli [Cad92,Cad93] proposed that
the model checking problem for propositional circumscription be investigated in
Schaefer’s framework and raised the question of whether it is possible to obtain
a complete classification of its complexity. This question was settled
affirmatively in [KK01a]; moreover, in [KK01b] the complexity of the inference problem
for propositional circumscription was investigated in the context of Schaefer’s
framework and a characterization of the $\Pi_2^P$-complete cases was obtained.

The  balance  of  this  extended  abstract  is  organized  as  follows.  In  Section  2, 
we present Schaefer's framework and state his main results on the complexity
of Boolean satisfiability problems. In Section 3, we describe our results on
the  complexity  of  the  model  checking  problem  and  the  inference  problem  for 
propositional  circumscription  in  the  context  of  Schaefer’s  framework.  Finally,  in 
Section  4  we  discuss  certain  open  problems  and  directions  for  future  research. 


2  Schaefer’s  Framework  for  Boolean  Satisfiability 

A logical relation R is a non-empty subset of $\{0,1\}^k$, for some $k \geq 1$. Such a
logical relation can be thought of as the set of all satisfying truth assignments of
a generalized propositional connective R'. Schaefer [Sch78] investigated Boolean
satisfiability  problems  in  which  the  inputs  are  formulas  in  generalized  conjunctive 
normal  form,  that  is  to  say,  they  are  conjunctions  of  atomic  formulas  derived 
from  a  fixed  finite  set  of  logical  relations. 

Definition 2: Let $S = \{R_1, \ldots, R_m\}$ be a finite set of logical relations of various
arities, let $S' = \{R'_1, \ldots, R'_m\}$ be a set of relation symbols whose arities match
those of the relations in S, and let V be an infinite set of variables.

- A CNF(S)-formula is a finite conjunction $C_1 \wedge \ldots \wedge C_n$ of clauses built
using relation symbols from S', variables from V, and the constants 0 and 1,
that is, each $C_i$ is an atomic formula of the form $R'_j(\xi_1, \ldots, \xi_k)$, where $R'_j$ is
a relation symbol of arity k in S', and each $\xi_l$ is a variable in V or one of
the constants 0 and 1. The semantics of CNF(S)-formulas is defined in a
standard way by assuming that variables range over the set of bits {0, 1}, each
relation symbol $R'_j$ in S' is interpreted by the corresponding relation $R_j$ in S,
and the constant symbols 0 and 1 are interpreted by 0 and 1 respectively.

- Sat(S) is the following decision problem: given a CNF(S)-formula $\varphi$, is it
satisfiable? (i.e., is there a truth assignment to the variables of $\varphi$ that makes
every clause of $\varphi$ true?) □
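
As an illustration of Definition 2, the following minimal sketch decides Sat(S) by exhaustive search. The representation is an assumption made only for this example: a logical relation is a Python set of bit-tuples, and a clause is a pair (relation, arguments) whose arguments are variable names or the constants 0 and 1. The function names are hypothetical, and the search is exponential in the number of variables, so it serves to fix the semantics rather than to solve instances efficiently.

```python
from itertools import product

def satisfies(clause, assignment):
    """True if the clause's argument tuple, evaluated under the assignment, lies in its relation."""
    relation, args = clause
    values = tuple(a if a in (0, 1) else assignment[a] for a in args)
    return values in relation

def variables_of(formula):
    """Sorted list of the variable names occurring in a formula (constants are skipped)."""
    return sorted({a for _, args in formula for a in args if a not in (0, 1)})

def models(formula):
    """Yield every satisfying truth assignment (as a dict) of a CNF(S)-formula."""
    names = variables_of(formula)
    for bits in product((0, 1), repeat=len(names)):
        assignment = dict(zip(names, bits))
        if all(satisfies(c, assignment) for c in formula):
            yield assignment

def is_satisfiable(formula):
    return next(models(formula), None) is not None

# Two binary relations: "x or y" and "not x or not y".
T0 = {(0, 1), (1, 0), (1, 1)}
T2 = {(0, 0), (0, 1), (1, 0)}
exactly_one = [(T0, ("x", "y")), (T2, ("x", "y"))]   # satisfiable
with_constants = [(T0, ("x", 0)), (T2, ("x", 1))]    # forces x = 1 and x = 0
```

Here `is_satisfiable(exactly_one)` holds, while `is_satisfiable(with_constants)` does not, which shows how the constants 0 and 1 enter a clause.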
It is clear that, for each finite set S of logical relations, Sat(S) is a problem in
NP. Moreover, the family of all Sat(S) problems contains several well-known
variants of Boolean satisfiability, as evidenced by the following examples.

Example 3: 3-Sat, the prototypical NP-complete problem, coincides with the
problem Sat(S), where $S = \{R_0, R_1, R_2, R_3\}$ and

- $R_0 = \{0,1\}^3 - \{(0,0,0)\}$ (expressing the clause $(x \vee y \vee z)$);

- $R_1 = \{0,1\}^3 - \{(1,0,0)\}$ (expressing the clause $(\neg x \vee y \vee z)$);

- $R_2 = \{0,1\}^3 - \{(1,1,0)\}$ (expressing the clause $(\neg x \vee \neg y \vee z)$);

- $R_3 = \{0,1\}^3 - \{(1,1,1)\}$ (expressing the clause $(\neg x \vee \neg y \vee \neg z)$).

Similarly, but on the side of tractability, 2-Sat coincides with the problem
Sat(S), where $S = \{T_0, T_1, T_2\}$ and

- $T_0 = \{0,1\}^2 - \{(0,0)\}$ (expressing the clause $(x \vee y)$);

- $T_1 = \{0,1\}^2 - \{(1,0)\}$ (expressing the clause $(\neg x \vee y)$);

- $T_2 = \{0,1\}^2 - \{(1,1)\}$ (expressing the clause $(\neg x \vee \neg y)$).

Positive-1-In-3-Sat is the following decision problem: given a 3CNF-formula
such that each clause is of the form $(x \vee y \vee z)$, does there exist a truth
assignment that makes true exactly one variable in each clause? This problem is
known to be NP-complete ([GJ79, LO4, page 259]). A moment's reflection
reveals that Positive-1-In-3-Sat coincides with the problem Sat(S), where S is
the singleton $\{R_{1/3}\}$ consisting of the relation

$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$.

Furthermore, it is easy to see that several other variants of Boolean satisfiability,
including 1-In-3-Sat, Not-All-Equal-3-Sat and Horn 3-Sat, can be cast
as Sat(S) problems for particular sets S of logical relations. □
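
The correspondence between the relations of Example 3 and the clauses they express can be checked directly. The sketch below is only a sanity check under an assumed encoding (relations as sets of bit-tuples); the helper `clause_models` is hypothetical and not part of Schaefer's framework.

```python
from itertools import product

CUBE = set(product((0, 1), repeat=3))

# The four relations that give rise to 3-Sat.
R0 = CUBE - {(0, 0, 0)}
R1 = CUBE - {(1, 0, 0)}
R2 = CUBE - {(1, 1, 0)}
R3 = CUBE - {(1, 1, 1)}

# The relation that gives rise to Positive-1-In-3-Sat.
R_1_IN_3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

def clause_models(signs):
    """Bit-tuples satisfying a three-literal clause.

    signs[i] is True when the i-th literal is positive and False when it is negated;
    a positive literal is true on bit 1, a negated one on bit 0.
    """
    return {t for t in CUBE if any(bit == int(pos) for bit, pos in zip(t, signs))}

checks = {
    "R0 is (x or y or z)": R0 == clause_models((True, True, True)),
    "R1 is (not x or y or z)": R1 == clause_models((False, True, True)),
    "R2 is (not x or not y or z)": R2 == clause_models((False, False, True)),
    "R3 is (not x or not y or not z)": R3 == clause_models((False, False, False)),
    "R_1/3 is exactly-one-true": R_1_IN_3 == {t for t in CUBE if sum(t) == 1},
}
```

Every entry of `checks` evaluates to `True`.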

The above examples demonstrate that the family of all Sat(S) problems
constitutes a flexible and rich framework for expressing restricted cases of
Boolean satisfiability. Schaefer [Sch78] studied the computational complexity of
Sat(S) problems and obtained the following remarkable classification theorem:
for every finite set S of logical relations, either Sat(S) is NP-complete or
Sat(S) is solvable in polynomial time; moreover, there is an algorithm to decide
whether Sat(S) is NP-complete or solvable in polynomial time. To appreciate
Schaefer's result, one should recall that Ladner [Lad75] showed that if
P $\neq$ NP, then there are problems in NP that are neither NP-complete nor in P,
i.e., there exist problems of intermediate computational complexity between
NP-complete and polynomial-time solvable. Consequently, Schaefer's result can be
described as a dichotomy theorem asserting that no Sat(S) problem is of such
intermediate computational complexity. In fact, Schaefer's result was the first
non-trivial dichotomy theorem for a family of NP-complete problems. Since
that time, dichotomy theorems have been obtained for several other families
of decision, counting, enumeration, and optimization problems (for instance,
see [FHW80,HN90,Cre95,CH96,CH97,KSW97]). Overall, however, dichotomy
theorems for families of algorithmic problems are rare; moreover, in view of
Ladner's theorem [Lad75], their existence cannot be taken for granted.

Before  stating  Schaefer’s  dichotomy  theorem  in  precise  terms,  we  need  to 
introduce  several  necessary  concepts. 

Definition 4: Let $\varphi$ be a propositional formula.

- $\varphi$ is bijunctive if it is a 2CNF-formula, i.e., it is a conjunction of clauses each
of which is a disjunction of at most two literals.

- $\varphi$ is Horn if it is the conjunction of clauses each of which is a disjunction of
literals such that at most one of them is a variable.

- $\varphi$ is dual Horn if it is the conjunction of clauses each of which is a disjunction
of literals such that at most one of them is a negated variable.

- $\varphi$ is affine if it is the conjunction of subformulas each of which is an exclusive
disjunction of literals or a negation of an exclusive disjunction of literals
(by definition, an exclusive disjunction of literals is satisfied exactly when
an odd number of these literals are true; we will use $\oplus$ as the symbol of the
exclusive disjunction).

Note that a formula $\varphi$ is affine precisely when the set of its satisfying assignments
is the set of solutions of a system of linear equations over the field {0, 1}. □

Definition 5: Let R be a logical relation and S a finite set of logical relations.

- R is bijunctive (Horn, dual Horn, or affine, respectively) if there is a
propositional formula $\varphi$ which is bijunctive (Horn, dual Horn, or affine, respectively)
and such that R coincides with the set of truth assignments satisfying $\varphi$.

- S is Schaefer if at least one of the following four conditions holds:

•  every member of S is bijunctive;

•  every member of S is Horn;

•  every member of S is dual Horn;

•  every member of S is affine.

- Otherwise, we say that S is non-Schaefer. □

There are simple criteria to determine whether a logical relation is bijunctive,
Horn, dual Horn, or affine. In fact, a set of such criteria was already provided by
Schaefer [Sch78]; moreover, Dechter and Pearl [DP92] gave even simpler criteria
for a relation to be Horn or dual Horn. Each of these criteria involves a closure
property of the logical relations at hand under a certain function. Specifically, a
relation R is bijunctive if and only if for all $t_1, t_2, t_3 \in R$, we have that
$(t_1 \vee t_2) \wedge (t_2 \vee t_3) \wedge (t_1 \vee t_3) \in R$, where the operators $\vee$ and $\wedge$ are applied
coordinate-wise to the bit-tuples. Note that the i-th coordinate of the tuple
$(t_1 \vee t_2) \wedge (t_2 \vee t_3) \wedge (t_1 \vee t_3)$ is equal to 1 exactly when the majority of the
i-th coordinates of $t_1, t_2, t_3$ is equal to 1. Thus, this criterion states that R is
bijunctive exactly when it is closed under coordinate-wise applications of the
ternary majority function. R is Horn (respectively, dual Horn) if and only if for
all $t_1, t_2 \in R$, we have that $t_1 \wedge t_2 \in R$ (respectively, $t_1 \vee t_2 \in R$). Finally, R is
affine if and only if for all $t_1, t_2, t_3 \in R$, we have that $t_1 \oplus t_2 \oplus t_3 \in R$. As an
example, it is easy to apply these criteria to the ternary relation
$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$ and verify that $R_{1/3}$ is neither bijunctive, nor
Horn, nor dual Horn, nor affine.
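
The four closure criteria lend themselves to a direct mechanical check. The following is a minimal sketch, assuming relations are represented as Python sets of bit-tuples; the function names are chosen for this illustration only. The check enumerates all pairs or triples of tuples, which is polynomial in the size of the relation.

```python
from itertools import product

def closed_under(relation, arity, op):
    """True if applying op coordinate-wise to any `arity` tuples of the relation
    always yields a tuple of the relation."""
    return all(
        tuple(op(*column) for column in zip(*choice)) in relation
        for choice in product(relation, repeat=arity)
    )

def is_bijunctive(r):
    # Closure under the ternary majority function, written as in the text.
    return closed_under(r, 3, lambda a, b, c: (a | b) & (b | c) & (a | c))

def is_horn(r):
    return closed_under(r, 2, lambda a, b: a & b)

def is_dual_horn(r):
    return closed_under(r, 2, lambda a, b: a | b)

def is_affine(r):
    return closed_under(r, 3, lambda a, b, c: a ^ b ^ c)

R_1_IN_3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
R0 = set(product((0, 1), repeat=3)) - {(0, 0, 0)}   # the clause (x or y or z)

verdict = {
    "bijunctive": is_bijunctive(R_1_IN_3),
    "Horn": is_horn(R_1_IN_3),
    "dual Horn": is_dual_horn(R_1_IN_3),
    "affine": is_affine(R_1_IN_3),
}
```

All four entries of `verdict` are `False`, in agreement with the claim about the relation of Positive-1-In-3-Sat, whereas `R0` passes the dual Horn test and fails the Horn test.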

There are well-known polynomial-time algorithms for the satisfiability
problem for the class of all bijunctive formulas (2-Sat), the class of all Horn
formulas, and the class of all dual Horn formulas. Moreover, if S is an affine set
of logical relations, then Sat(S) is solvable in polynomial time using Gaussian
elimination. Schaefer's seminal discovery was that these four cases are the only
ones that give rise to tractable cases of Sat(S).
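
To make the affine case concrete, here is a minimal sketch of Gaussian elimination over the field {0, 1}. It assumes the affine formula has already been rewritten as a linear system, with one row per equation of the form $x_{i_1} \oplus \ldots \oplus x_{i_r} = b$; the function name and the row encoding are assumptions of this example.

```python
def solve_gf2(rows):
    """Solve a linear system over GF(2).

    rows is a list of (coefficients, rhs) pairs, where coefficients is a list of
    bits (one per variable) and rhs is a bit. Returns one solution as a list of
    bits, or None if the system is inconsistent.
    """
    rows = [(list(c), b) for c, b in rows]
    n = len(rows[0][0]) if rows else 0
    pivots = []                       # (row index, column) of each pivot
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(rows)) if rows[i][0][col]), None)
        if pivot is None:
            continue                  # no pivot here: the variable stays free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][0][col]:
                # Add (XOR) the pivot row to clear this column elsewhere.
                rows[i] = (
                    [a ^ b for a, b in zip(rows[i][0], rows[r][0])],
                    rows[i][1] ^ rows[r][1],
                )
        pivots.append((r, col))
        r += 1
    if any(not any(c) and b for c, b in rows):
        return None                   # some row reads 0 = 1
    solution = [0] * n                # free variables are set to 0
    for row, col in pivots:
        solution[col] = rows[row][1]
    return solution

# x + y = 1, y + z = 0, x + z = 1 (all sums modulo 2)
system = [([1, 1, 0], 1), ([0, 1, 1], 0), ([1, 0, 1], 1)]
solution = solve_gf2(system)
```

For this system the sketch returns the assignment x = 1, y = 0, z = 0; a system containing both x + y = 1 and x + y = 0 yields `None`.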

Theorem 6: [Schaefer's Dichotomy Theorem, [Sch78]] Let S be a finite set of
logical relations. If S is Schaefer, then Sat(S) is solvable in polynomial time;
otherwise, it is NP-complete.

As an application, Theorem 6 immediately implies that Positive-1-In-3-Sat
is NP-complete, since it coincides with Sat($\{R_{1/3}\}$), and $\{R_{1/3}\}$ is not Schaefer.

To obtain the above dichotomy theorem, Schaefer had to first establish a
result concerning the expressive power of CNF(S)-formulas. Informally, this
result asserts that if S is a non-Schaefer set of logical relations, then
CNF(S)-formulas have extremely high expressive power, in the sense that every
logical relation can be defined from a CNF(S)-formula using existential
quantification.


Theorem 7: [Schaefer's Expressibility Theorem, [Sch78]] Let S be a finite set
of logical relations. If S is non-Schaefer, then for every k-ary logical relation R
there is a CNF(S)-formula $\varphi(x_1, \ldots, x_k, z_1, \ldots, z_m)$ such that R coincides with
the set of all truth assignments to the variables $x_1, \ldots, x_k$ that satisfy the formula
$(\exists z_1) \cdots (\exists z_m)\, \varphi(x_1, \ldots, x_k, z_1, \ldots, z_m)$.

3  Model  Checking  and  Inference  in  Circumscription 

Schaefer’s  framework  makes  it  possible  to  introduce  and  study  restricted  cases 
of the model checking problem and the inference problem for propositional
circumscription.

Definition 8: Let S be a finite set of logical relations.

- MC-Circ(S) is the following decision problem: given a CNF(S)-formula $\varphi$
and a truth assignment $\alpha$, is $\alpha$ a minimal model of $\varphi$?

- Inf-Circ(S) is the following decision problem: given a CNF(S)-formula $\varphi$
and a CNF-formula $\psi$, is $\psi$ true in every minimal model of $\varphi$? □
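
Both problems of Definition 8 can be stated operationally by brute force. The sketch below assumes that truth assignments are compared coordinate-wise and that all variables are minimized, in line with the use of $\beta < \alpha$ later in the text; relations are sets of bit-tuples, clauses are (relation, arguments) pairs, and all function names are hypothetical. It is exponential in the number of variables and is meant only to pin down the definitions.

```python
from itertools import product

def all_models(formula, names):
    """Satisfying assignments of a CNF(S)-formula, as bit-tuples ordered like `names`."""
    found = []
    for bits in product((0, 1), repeat=len(names)):
        env = dict(zip(names, bits))
        # env.get(a, a) maps a variable to its value and leaves constants 0/1 unchanged.
        if all(tuple(env.get(a, a) for a in args) in rel for rel, args in formula):
            found.append(bits)
    return found

def strictly_below(b, a):
    """True if b <= a coordinate-wise and b differs from a."""
    return b != a and all(x <= y for x, y in zip(b, a))

def minimal_models(formula, names):
    ms = all_models(formula, names)
    return [a for a in ms if not any(strictly_below(b, a) for b in ms)]

def is_minimal_model(formula, names, alpha):
    """Brute-force version of the MC-Circ question."""
    return alpha in minimal_models(formula, names)

def circ_entails(formula, names, query):
    """Brute-force version of the Inf-Circ question; `query` is a predicate on bit-tuples."""
    return all(query(a) for a in minimal_models(formula, names))

T0 = {(0, 1), (1, 0), (1, 1)}            # the clause (x or y)
formula, names = [(T0, ("x", "y"))], ["x", "y"]
not_both = lambda a: not (a[0] and a[1])  # the clause (not x or not y)
```

For the single clause (x or y), the assignment (1, 1) is a model but not a minimal one, and the clause (not x or not y) holds in every minimal model although it is not a classical consequence, which illustrates the nonmonotonic character of circumscriptive inference.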

Using the definitions, it is easy to see that, for every finite set S of
logical relations, MC-Circ(S) is in coNP and Inf-Circ(S) is in $\Pi_2^P$. There are
natural sets S of logical relations such that MC-Circ(S) is coNP-complete
and Inf-Circ(S) is $\Pi_2^P$-complete. In particular, this holds true for the set
$S = \{R_0, R_1, R_2, R_3\}$ of logical relations in Example 3 that give rise to 3-Sat
(see [Cad92,Cad93,EG93]). In contrast, as pointed out in [Cad92,Cad93,CL94],
if S is a bijunctive or a dual Horn set of logical relations, then MC-Circ(S)
is in P and Inf-Circ(S) is in coNP. Moreover, if S is a Horn set of logical
relations, then both MC-Circ(S) and Inf-Circ(S) are in P; this is so because
every satisfiable Horn formula has a minimum (unique minimal) satisfying truth
assignment that can be found in polynomial time.

In view of Schaefer's dichotomy theorem for Boolean satisfiability problems,
it is natural to ask whether similar dichotomy theorems can be obtained for the
family MC-Circ(S) of model checking problems for propositional circumscription
and the family Inf-Circ(S) of inference problems for propositional
circumscription, where S is a finite set of logical relations. At first, one may
expect that, if such dichotomy theorems hold, then the boundary of the dichotomy
will be the same as that in Schaefer's dichotomy theorem. In particular, one may
expect that if S is a non-Schaefer set of logical relations, then MC-Circ(S)
should be coNP-complete and Inf-Circ(S) should be $\Pi_2^P$-complete.
Nevertheless, this turns out to be a rather naive expectation. Indeed, consider
the set $S = \{R_{1/3}\}$ consisting of the logical relation
$R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$.
As seen earlier, S is non-Schaefer and so Sat(S) is NP-complete (recall that in
this case Sat(S) is Positive-1-In-3-Sat). It is easy to see, however, that if $\varphi$
is a CNF(S)-formula, then every satisfying truth assignment of $\varphi$ is a minimal
model of $\varphi$. Consequently, MC-Circ(S) is in P (in fact, it is solvable in linear
time) and Inf-Circ(S) is in coNP (in fact, it is coNP-complete).
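
The observation about Positive-1-In-3-Sat can be tested on small instances. The reason is that, if a variable is true in a model, then it is the unique true variable of any clause containing it, so switching it off (and possibly others) leaves that clause without a true variable. The sketch below checks this on one hand-made formula; the formula, the variable names, and the helper functions are assumptions of this example, and clauses are restricted to variables for brevity.

```python
from itertools import product

R_1_IN_3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

def models(clauses, names):
    """Satisfying assignments (bit-tuples ordered like `names`) of a conjunction
    of clauses, each requiring exactly one of its three variables to be true."""
    result = []
    for bits in product((0, 1), repeat=len(names)):
        env = dict(zip(names, bits))
        if all(tuple(env[v] for v in clause) in R_1_IN_3 for clause in clauses):
            result.append(bits)
    return result

def comparable(a, b):
    """True if a and b are distinct and one lies coordinate-wise below the other."""
    below = all(x <= y for x, y in zip(a, b))
    above = all(x >= y for x, y in zip(a, b))
    return a != b and (below or above)

clauses = [("x", "y", "z"), ("z", "u", "v"), ("x", "u", "w")]
names = ["x", "y", "z", "u", "v", "w"]
found = models(clauses, names)
all_minimal = not any(comparable(a, b) for a in found for b in found)
```

Here `found` is non-empty and `all_minimal` is `True`: no two models are comparable, so every model is minimal and model checking reduces to checking satisfaction.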

In [KK01a], the following dichotomy theorem was established for the family
MC-Circ(S) of model checking problems for propositional circumscription: if S
is a finite set of logical relations, then either MC-Circ(S) is coNP-complete
or MC-Circ(S) is in P. Furthermore, in [KK01b], the following dichotomy
theorem was established for the family Inf-Circ(S) of inference problems for
propositional circumscription: if S is a finite set of logical relations, then either
Inf-Circ(S) is $\Pi_2^P$-complete or Inf-Circ(S) is in coNP. It was also shown that
the boundaries in these two dichotomies coincide, but differ from the boundary
in Schaefer's dichotomy theorem for Boolean satisfiability problems. These new
dichotomy theorems were proved by first establishing corresponding dichotomy
theorems in a key special case and then using the results for this special case as
a stepping stone towards the full dichotomy theorems.

Definition 9: A k-ary logical relation R is 1-valid if it contains the all-ones
k-tuple (1, ..., 1). A set S of logical relations is 1-valid if every relation in S
is 1-valid. □

For example, the logical relation K = {(1,1,1), (0,1,0), (0,0,1)} is 1-valid.
Note that the set $S = \{R_0, R_1, R_2, R_3\}$ in Example 3 is not 1-valid, since the
relation $R_3$ is not 1-valid. In contrast, the set $P = \{R_0, R_1, R_2\}$ is 1-valid.

We now have all the prerequisites to state our dichotomy theorems for the
model checking problem MC-Circ(S) and the inference problem Inf-Circ(S),
when S is a 1-valid set of logical relations. In this case, the boundary of the
dichotomies coincides with the boundary in Schaefer's dichotomy theorem.

Theorem 10: [KK01a,KK01b] Let S be a 1-valid set of logical relations.

- If S is Schaefer, then MC-Circ(S) is in P; otherwise, it is coNP-complete.

- If S is Schaefer, then Inf-Circ(S) is in coNP; otherwise, it is $\Pi_2^P$-complete.
Actually, if S is non-Schaefer, then even the following case of Inf-Circ(S)
is $\Pi_2^P$-complete: given a CNF(S)-formula $\varphi$ and a negative literal $\neg u$, does
$\varphi \models_{\mathrm{CIRC}} \neg u$?

Moreover, there is a polynomial-time algorithm to decide, given a finite
1-valid set S of logical relations, whether MC-Circ(S) is in P or coNP-complete,
and also whether Inf-Circ(S) is in coNP or $\Pi_2^P$-complete.

The following examples illustrate the preceding Theorem 10 and provide new
instances of restricted cases of the model checking problem and the inference
problem for propositional circumscription having the same inherent complexity
as the general case.

Example 11: Consider again the logical relation K = {(1,1,1), (0,1,0),
(0,0,1)}. Using the closure properties that characterize when a logical relation
is bijunctive, Horn, dual Horn, or affine, it is easy to see that K is none of the
above. For instance, K is not Horn because $(0,1,0) \wedge (0,0,1) = (0,0,0) \notin K$;
similarly, K is not affine because $(1,1,1) \oplus (0,1,0) \oplus (0,0,1) = (1,0,0) \notin K$.
Consequently, Theorem 10 implies that MC-Circ({K}) is coNP-complete and
Inf-Circ({K}) is $\Pi_2^P$-complete. □

Example 12: Consider the 1-valid set $P = \{R_0, R_1, R_2\}$, where, as seen earlier,
$R_0 = \{0,1\}^3 - \{(0,0,0)\}$ (expressing the clause $(x \vee y \vee z)$),
$R_1 = \{0,1\}^3 - \{(1,0,0)\}$ (expressing the clause $(\neg x \vee y \vee z)$), and
$R_2 = \{0,1\}^3 - \{(1,1,0)\}$ (expressing the clause $(\neg x \vee \neg y \vee z)$).
Thus, the class of CNF(P)-formulas consists of all 3CNF-formulas that do not
contain a clause of the form $(\neg x \vee \neg y \vee \neg z)$.
Using the closure properties, it is easy to verify that $R_1$ is not bijunctive, Horn,
or affine, and that $R_2$ is not dual Horn. Consequently, Theorem 10 implies that
MC-Circ(P) is coNP-complete and Inf-Circ(P) is $\Pi_2^P$-complete. □

Theorem  10  can  be  used  as  a  stepping  stone  to  obtain  dichotomy  theorems 
for  the  family  of  all  MC-ClRC(5)  problems  and  the  family  of  all  Inf-Circ(5) 
problems,  where  S  is  an  arbitrary  set  of  logical  relations.  To  this  effect,  we  use 
the following crucial concept, which was first introduced in [KK01a].

Definition 13: Let R be a k-ary logical relation. We say that a logical relation T
is a 0-section of R if either T is the relation R itself or T can be defined
from the formula $R'(x_1, \ldots, x_k)$ by replacing at least one, but not all, of the
variables $x_1, \ldots, x_k$ by 0. □

To illustrate this concept, observe that the 1-valid logical relation {(1)}
is a 0-section of $R_{1/3} = \{(1,0,0), (0,1,0), (0,0,1)\}$, since it is definable by
$R'_{1/3}(x_1, 0, 0)$. Note that the logical relation {(1,0), (0,1)} is also a 0-section
of $R_{1/3}$, since it is definable by the formula $R'_{1/3}(0, x_2, x_3)$, but it is not 1-valid.
In fact, it is easy to verify that {(1)} is the only logical relation that is both
1-valid and a 0-section of $R_{1/3}$.
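
The 0-sections of a relation can be enumerated by trying every way of fixing some, but not all, argument positions to 0. The following sketch assumes relations are sets of equal-length bit-tuples and discards empty results, since logical relations are non-empty by definition; the function names are hypothetical.

```python
from itertools import combinations

def zero_sections(relation):
    """All 0-sections of a relation, each returned as a frozenset of bit-tuples."""
    k = len(next(iter(relation)))
    sections = {frozenset(relation)}              # the relation itself
    for size in range(1, k):                      # at least one, but not all, positions
        for zeroed in combinations(range(k), size):
            kept = [i for i in range(k) if i not in zeroed]
            section = frozenset(
                tuple(t[i] for i in kept)
                for t in relation
                if all(t[i] == 0 for i in zeroed)
            )
            if section:                           # logical relations are non-empty
                sections.add(section)
    return sections

def is_one_valid(relation):
    """True if the relation contains the all-ones tuple of its arity."""
    k = len(next(iter(relation)))
    return (1,) * k in relation

R_1_IN_3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
sections = zero_sections(R_1_IN_3)
one_valid_sections = {s for s in sections if is_one_valid(s)}
```

For this relation the sketch finds three 0-sections, namely the relation itself, {(1,0), (0,1)}, and {(1)}, of which only {(1)} is 1-valid.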
Theorem 14: [KK01a,KK01b] Let S be a set of logical relations and let $S^*$ be
the set of all logical relations T such that T is both 1-valid and a 0-section of
some relation in S.

- If $S^*$ is Schaefer, then MC-Circ(S) is in P; otherwise, it is coNP-complete.

- If $S^*$ is Schaefer, then Inf-Circ(S) is in coNP; otherwise, it is $\Pi_2^P$-complete.
Actually, if $S^*$ is non-Schaefer, then even the following case of Inf-Circ(S)
is $\Pi_2^P$-complete: given a CNF(S)-formula $\varphi$ and a negative literal $\neg u$, does
$\varphi \models_{\mathrm{CIRC}} \neg u$?

Moreover, there is a polynomial-time algorithm to decide, given a finite
1-valid set of logical relations, whether MC-Circ(S) is in P or coNP-complete,
and also whether Inf-Circ(S) is in coNP or $\Pi_2^P$-complete.
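
Theorem 14 suggests a simple decision procedure: compute $S^*$, test the four closure properties on it, and read off the complexity. The sketch below follows that recipe under the same assumed encoding as before (relations as sets of bit-tuples); the names `s_star` and `classify` are hypothetical, and the returned strings merely restate the two cases of the theorem.

```python
from itertools import combinations, product

def closed_under(relation, arity, op):
    return all(
        tuple(op(*col) for col in zip(*choice)) in relation
        for choice in product(relation, repeat=arity)
    )

CLOSURE_TESTS = [
    (3, lambda a, b, c: (a | b) & (b | c) & (a | c)),  # bijunctive (majority)
    (2, lambda a, b: a & b),                           # Horn
    (2, lambda a, b: a | b),                           # dual Horn
    (3, lambda a, b, c: a ^ b ^ c),                    # affine
]

def is_schaefer(relations):
    """True if all relations share at least one of the four closure properties."""
    return any(all(closed_under(r, n, op) for r in relations) for n, op in CLOSURE_TESTS)

def one_valid_zero_sections(relation):
    k = len(next(iter(relation)))
    found = set()
    for size in range(0, k):                      # size 0 gives the relation itself
        for zeroed in combinations(range(k), size):
            kept = [i for i in range(k) if i not in zeroed]
            section = frozenset(
                tuple(t[i] for i in kept) for t in relation
                if all(t[i] == 0 for i in zeroed)
            )
            if (1,) * len(kept) in section:       # keep only 1-valid sections
                found.add(section)
    return found

def s_star(relations):
    return set().union(*(one_valid_zero_sections(r) for r in relations))

def classify(relations):
    hard = not is_schaefer(s_star(relations))
    return {
        "MC-Circ": "coNP-complete" if hard else "in P",
        "Inf-Circ": "Pi_2^P-complete" if hard else "in coNP",
    }

CUBE = set(product((0, 1), repeat=3))
R0, R1, R2, R3 = (CUBE - {t} for t in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)])
R_1_IN_3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
```

With these definitions, `classify([R0, R1, R2, R3])` reports the hard case for both problems, while `classify([R0, R3])` and `classify([R_1_IN_3])` report the easy case, even though Sat(S) is NP-complete for each of these three sets.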

We now present several different examples that illustrate the power of
Theorem 14. The first example shows how the main result in [EG93] can be easily
derived  from  Theorem  14. 

Example 15: Let $S = \{R_0, R_1, R_2, R_3\}$ be the set of logical relations that give
rise to 3-Sat. Since $R_0, R_1, R_2$ are 1-valid logical relations, they are members
of $S^*$. It follows that $S^*$ is not Schaefer, since, as seen earlier, $R_1$ is not bijunctive,
Horn or affine, and $R_2$ is not dual Horn. Theorem 14 immediately implies that
Inf-Circ(S) is $\Pi_2^P$-complete. □

Example 16: Consider the set $S = \{R_0, R_3\}$, where $R_0$ and $R_3$ are as in the
preceding Example 15. In this case, Sat(S) is the problem Monotone 3-Sat,
that is to say, the restriction of 3-Sat to 3CNF-formulas in which every clause
is either the disjunction of positive literals or the disjunction of negative literals.
It is well known that this problem is NP-complete (this can also be derived from
Schaefer's Dichotomy Theorem 6). It is not hard to verify that every relation
in $S^*$ is dual Horn (for instance, $S^*$ contains $R_0$, which is dual Horn).
Consequently, Theorem 14 implies that MC-Circ(S) is in P and Inf-Circ(S) is in
coNP. □

The preceding example shows that the boundary in Schaefer's dichotomy
theorem for Boolean satisfiability is different from the boundary in the dichotomy
theorem for the model checking problem and the inference problem in
propositional circumscription. Our final example provides several other instances
of this phenomenon.

Example 17: If m and n are two positive integers with m < n, then $R_{m/n}$ is
the n-ary logical relation consisting of all n-tuples that have m ones and n - m
zeros. It is easy to see that $R_{m/n}$ is not Schaefer. Consequently, if S is a set of
logical relations each of which is of the form $R_{m/n}$ for some m and n with m < n,
then Sat(S) is NP-complete. In contrast, $S^*$ is easily seen to be Horn (and,
hence, Schaefer), since every relation T in $S^*$ is a singleton $T = \{(1, \ldots, 1)\}$
consisting of the m-ary all-ones tuple for some m. Consequently, Theorem 14
implies that MC-Circ(S) is in P and Inf-Circ(S) is in coNP.
This family of examples contains Positive-1-In-3-Sat as the special case
where $S = \{R_{1/3}\}$. □

Remark  18:  The  proofs  of  Theorem  10  and  Theorem  14  can  be  found  in 
[KK01a,KK01b].  They  make  use  of  the  aforementioned  Schaefer’s  expressibility 
theorem  (Theorem  7),  additional  specialized  expressibility  results,  and  a  series 
of  delicate  reductions  between  problems. 

It should be pointed out that the problem actually studied in [KK01a] is
the minimal satisfiability problem Min Sat(S), which is the complement of
MC-Circ(S): given a CNF(S)-formula $\varphi$ and a satisfying truth assignment $\alpha$,
is there a satisfying truth assignment $\beta$ such that $\beta < \alpha$? Consequently,
the dichotomy obtained in [KK01a] is a dichotomy between NP-completeness
vs. membership in P, and clearly implies the dichotomy for MC-Circ(S).

Note that here the logical constants 0 and 1 were allowed in the construction
of CNF(S)-formulas. Schaefer [Sch78] also obtained a dichotomy theorem
for the satisfiability problem Sat(S), when restricted to CNF(S)-formulas
without constants; this result requires the deployment of additional technical
machinery. Dichotomy theorems for MC-Circ(S) and Inf-Circ(S), when restricted to
CNF(S)-formulas without constants, were also obtained in [KK01a,KK01b]. □

Remark 19: The dichotomy theorem for the family Inf-Circ(S) can be
interpreted as asserting that, for every finite set S of logical relations, either
Inf-Circ(S) is as hard as the full inference problem for propositional
circumscription ($\Pi_2^P$-complete) or Inf-Circ(S) is no harder than the inference
problem for ordinary propositional logic (since the latter is coNP-complete).

It should be noted that researchers in computational complexity have isolated
and studied several interesting complexity classes between coNP and $\Pi_2^P$, each
with its own distinctive complete problems, such as the class DP of problems
that are conjunctions of NP and coNP predicates. In fact, an entire hierarchy
of complexity classes, known as the Boolean Hierarchy BH, is sandwiched
between coNP and $\Pi_2^P$ (see [Joh90]). Thus, our dichotomy theorem for the family
Inf-Circ(S) reveals a dramatic gap in the complexity of the inference problem
for propositional circumscription between sets S of logical relations that are
Schaefer and those that are non-Schaefer. □

4  Open  Problems 

The dichotomy theorem for the family Inf-Circ(S), where S is a finite set of
logical relations, characterizes the "truly hard" ($\Pi_2^P$-complete) cases of the
inference problem for propositional circumscription in Schaefer's framework, but
leaves open the possibility that further distinctions can be made between the
"easier" cases of this problem. To this effect, we conjecture that a trichotomy
theorem holds for the family Inf-Circ(S). Specifically, we conjecture that, for
every finite set S of logical relations, exactly one of the following three
alternatives holds:

1. Inf-Circ(S) is $\Pi_2^P$-complete;

2. Inf-Circ(S) is coNP-complete;

3. Inf-Circ(S) is in P.

In view of the results described here, it remains to show that a dichotomy
theorem holds for Inf-Circ(S), when S is a Schaefer set of logical relations. Although
partial results in this direction have been obtained in [CL94], much more remains
to be done. In particular, the exact complexity of Inf-Circ(S) is not known,
when S is an affine set of logical relations.

All dichotomy theorems described here are rather special to Boolean logic.
Schaefer [Sch78] raised the problem of establishing dichotomy theorems for
satisfiability problems over domains with at least three elements, i.e., dichotomy
theorems for many-valued propositional logic. This problem remains open to
date with no solution in sight, even for the case of 3-valued logic.
Abstract. Data integration is the problem of combining the data
residing at different sources, and providing a unified view of these data,
called global schema, which can be queried by the user. The interest in
this kind of system has been continuously growing in recent years.
However, the design of a data integration system is a very complex task,
and several issues remain open, including how to express the relation
between the global schema and the sources, and how to process queries
expressed on the global schema. In this paper we deal with these two
problems, by presenting a logical framework for data integration, and
by discussing the various choices for both the specification of a data
integration system, and the design of query answering methods. Also,
we elaborate on the observation that, in real world scenarios, the case
of mutually inconsistent local databases will be very common, and we
present the basic ideas in order to extend the integration framework with
suitable nonmonotonic reasoning features for dealing with this case.


1  Introduction 

Data integration is the problem of combining the data residing at different
sources, and providing the user with a unified view of these data, called the
global schema. The global schema is therefore a reconciled view of
the information, which can be queried by the user. It is the task of the data
integration system to free the user from having to know where the data are, and
how they are structured at the sources.

The interest in this kind of system has been continuously growing in
recent years. However, the design of a data integration system is a very complex
task, and several issues remain open. Two main problems complicate the task:

1.  How  to  express  the  relation  between  the  global  schema  and  the  sources, 

2.  How  to  process  queries  expressed  on  the  global  schema. 

With  regard  to  Problem  (1),  two  basic  approaches  have  been  used  to  specify 
the  relation  between  the  sources  and  the  global  schema.  The  first  approach,  called 
global- as-view  (or  query-based),  requires  that  the  global  schema  is  expressed  in 
terms  of  the  data  sources.  More  precisely,  to  every  concept  of  the  global  schema, 
a  view  over  the  data  sources  is  associated,  so  that  its  meaning  is  specified  in 
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terms  of  the  data  residing  at  the  sources.  The  second  approach,  called  local-as- 
view  (or  source-based),  requires  the  global  schema  to  be  specified  independently 
from  the  sources.  The  relationships  between  the  global  schema  and  the  sources 
are  established  by  defining  every  source  as  a  view  over  the  global  schema.  Thus, 
in  the  local-as-view  approach,  we  specify  the  meaning  of  the  sources  in  terms  of 
the  concepts  in  the  global  schema.  It  is  clear  that  the  latter  approach  favors  the 
extensibility  of  the  integration  system,  and  provides  a  more  appropriate  setting 
for its maintenance. For example, adding a new source to the system requires
only providing the definition of the source, and does not necessarily involve
changes  in  the  global  schema.  On  the  contrary,  in  the  global-as-view  approach, 
adding  a  new  source  typically  requires  changing  the  definition  of  the  concepts  in 
the  global  schema.  A  comparison  between  the  two  approaches  is  reported  in  [20]. 

Problem  (2)  is  concerned  with  the  choice  of  the  method  for  computing  the 
answer  to  queries  posed  in  terms  of  the  global  schema.  While  query  answering 
in  the  global-as-view  approach  typically  reduces  to  unfolding,  an  integration 
system  based  on  the  local-as-view  approach  must  resort  to  more  sophisticated 
query  processing  techniques.  The  main  issue  is  that  the  system  should  be  able 
to  reason  on  the  mapping  so  as  to  re-express  the  query  in  terms  of  a  suitable 
set  of  queries  posed  to  the  sources.  In  this  reformulation  process,  the  crucial 
step  is  deciding  how  to  decompose  the  query  on  the  global  schema  into  a  set  of 
subqueries  on  the  sources,  based  on  the  meaning  of  the  sources  in  terms  of  the 
concepts in the global schema. The computed subqueries are then shipped to
the sources, and the results are assembled into the final answer.
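
The unfolding step mentioned above for the global-as-view approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source relations `s1_emp` and `s2_staff`, the global concept `employee`, and the helper names are hypothetical and do not come from the paper.

```python
# Hypothetical source relations; names and data are illustrative only.
SOURCES = {
    "s1_emp": {("ann", "sales"), ("bob", "it")},
    "s2_staff": {("carl", "it")},
}

# Global-as-view mapping: each global concept is defined by a view over the sources.
GAV_MAPPING = {
    "employee": lambda src: src["s1_emp"] | src["s2_staff"],
}


def unfold(concept, src):
    """Replace a global concept by its view definition and evaluate it on the sources."""
    return GAV_MAPPING[concept](src)


def employees_in(dept, src):
    """A query posed on the global schema: names of employees working in `dept`."""
    return {name for (name, d) in unfold("employee", src) if d == dept}


print(sorted(employees_in("it", SOURCES)))
```

Note that adding a third source in this style means editing the view attached to `employee`, which is the maintenance cost of global-as-view discussed earlier.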

Independently of the method used for the specification of the mapping between
the global schema and the source schemas, it is easy to see that query
processing in data integration is related to query answering using views. In turn,
query answering using views can be seen as a form of reasoning with incomplete
information. The interested reader is referred to [21] for a survey on this subject.
Query answering using views has been investigated in recent years in the context
of simplified frameworks. In [16,18], the problem has been studied for the
case  of  conjunctive  queries  (with  or  without  arithmetic  comparisons),  in  [2]  for 
disjunctive  views,  in  [19,10,13]  for  queries  with  aggregates,  in  [11]  for  recursive 
queries  and  nonrecursive  views,  and  in  [6,7]  for  several  variants  of  regular  path 
queries.  Comprehensive  frameworks  for  view-based  query  answering,  as  well  as 
several  interesting  results  for  various  query  languages,  are  presented  in  [12,1]. 

Query  answering  using  views  is  also  tightly  related  to  query  rewriting 
[16,11,20].  In  general,  a  rewriting  of  a  query  with  respect  to  a  set  of  views  is 
a  function  that,  given  the  extensions  of  the  views,  returns  a  set  of  tuples  that  is 
contained  in  the  answer  set  of  the  query  with  respect  to  the  views.  Usually,  one 
fixes a priori the language in which to express rewritings (e.g., unions of conjunctive
queries), and then looks for the best possible rewriting expressible in
such  a  language.  On  the  other  hand,  we  may  call  perfect  a  rewriting  that  returns 
exactly  the  answer  set  of  the  query  with  respect  to  the  views,  independently 
of  the  language  in  which  it  is  expressed.  Hence,  if  an  algorithm  for  answering 
queries  using  views  exists,  it  can  be  viewed  as  a  perfect  rewriting  [8,9]. 

In this paper, we present a logical framework for data integration, and we discuss
the various choices for both the specification of a data integration system
and the design of query answering methods. Also, we elaborate on the observation
that, in real-world scenarios, the case of mutually inconsistent source
databases will be very common, and we present the basic ideas for extending
the integration framework with suitable nonmonotonic reasoning features
for dealing with this case.

The  paper  is  organized  as  follows.  In  the  next  section  we  set  up  a  formal 
framework for data integration, based on first-order logic. In Section 3, we discuss
three  basic  means  for  specifying  the  mapping  between  the  global  schema  and 
the  source  schemas.  In  Section  4  we  extend  the  framework  in  order  to  cope  with 
the  problem  of  integrating  incoherent  source  databases.  Section  5  concludes  the 
paper. 

2  Framework 

In  this  section  we  set  up  a  formal  framework  for  data  integration  systems  (DISs). 
In  what  follows,  one  of  the  main  aspects  is  the  definition  of  the  semantics  of  both 
the DIS, and of queries posed to the global schema. To keep things simple,
we use in the following a unique semantic domain composed of a fixed,
infinite set of symbols.

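Formally, a DIS $\mathcal{D}$ is a triple $\langle \mathcal{G}, \mathcal{S}, \mathcal{M}_{\mathcal{G},\mathcal{S}} \rangle$,
where $\mathcal{G}$ is the global schema, $\mathcal{S}$ is the set of source schemas, and
$\mathcal{M}_{\mathcal{G},\mathcal{S}}$ is the mapping between $\mathcal{G}$ and the source
schemas in $\mathcal{S}$.

We denote with $\mathcal{A}_{\mathcal{G}}$ the alphabet of terms of the global schema, and we assume
that the global schema $\mathcal{G}$ of a DIS is expressed as a theory (named simply $\mathcal{G}$) in
a logic $\mathcal{L}_{\mathcal{G}}$.

We assume to have a set $\mathcal{S}$ of $n$ source schemas $\mathcal{S}_1, \ldots, \mathcal{S}_n$. We denote with
$\mathcal{A}_{\mathcal{S}_i}$ the alphabet of terms of the source schema $\mathcal{S}_i$. We also denote with $\mathcal{A}_{\mathcal{S}}$ the
union of all the $\mathcal{A}_{\mathcal{S}_i}$'s. We assume that the various $\mathcal{A}_{\mathcal{S}_i}$'s are mutually disjoint,
and each one is disjoint from the alphabet $\mathcal{A}_{\mathcal{G}}$. We assume that each source
schema is expressed as a theory (named simply $\mathcal{S}_i$) in a logic $\mathcal{L}_{\mathcal{S}_i}$, and we use $\mathcal{S}$
to denote the collection of theories $\mathcal{S}_1, \ldots, \mathcal{S}_n$.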

The mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ is the heart of the DIS, in that it specifies how the
concepts¹ in the global schema and in the source schemas map to each other.
We discuss this aspect in more depth in the next section. Here, we simply assume
that $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ is an appropriate specification of how the concepts in the various
schemas map to each other.

¹ Here and below we use the term “concept” for denoting a concept of the schema,
which in turn can be represented either by a class or by a relation (not necessarily
atomic) in logic.

Intuitively, in specifying the semantics of a DIS, we have to start with a model
of the source schemas, and the crucial point is to specify which are the models of
the global schema. Thus, for assigning semantics to a DIS
$\mathcal{D} = \langle \mathcal{G}, \mathcal{S}, \mathcal{M}_{\mathcal{G},\mathcal{S}} \rangle$, we
start by considering a source model $\mathcal{B}$ for $\mathcal{D}$, i.e., an interpretation that is a model
for all the theories of $\mathcal{S}$. We call global interpretation for $\mathcal{D}$ any interpretation
for $\mathcal{G}$. A global interpretation $\mathcal{I}$ for $\mathcal{D}$ is said to be a global model for $\mathcal{D}$ wrt $\mathcal{B}$ if:

- $\mathcal{I}$ is a model of $\mathcal{G}$, and

- $\mathcal{I}$ satisfies the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$.

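In the next section, we will come back to the notion of satisfying a mapping wrt
a source model. The semantics of $\mathcal{D}$, denoted $sem(\mathcal{D})$, is defined as follows:

$$sem(\mathcal{D}) = \{\, \mathcal{I} \mid \text{there exists a source model } \mathcal{B} \text{ for } \mathcal{D} \text{ s.t. } \mathcal{I} \text{ is a global model for } \mathcal{D} \text{ wrt } \mathcal{B} \,\}$$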

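Queries posed to a DIS $\mathcal{D}$ are expressed in terms of a query language $\mathcal{Q}_{\mathcal{G}}$
over the alphabet $\mathcal{A}_{\mathcal{G}}$ and are intended to extract a set of tuples of elements of
the semantic domain. Thus, every query has an associated arity, and the semantics of a
query $q$ of arity $n$ is defined as follows. The answer $q^{\mathcal{D}}$ of $q$ to $\mathcal{D}$ is the set of tuples

$$q^{\mathcal{D}} = \{\, (c_1, \ldots, c_n) \mid \text{for all } \mathcal{I} \in sem(\mathcal{D}),\ (c_1, \ldots, c_n) \in q^{\mathcal{I}} \,\}$$

where $q^{\mathcal{I}}$ denotes the result of evaluating $q$ in the interpretation $\mathcal{I}$.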
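
The definition of the answer to a query can be read as an intersection over all global models. The following Python sketch illustrates this reading under a strong simplifying assumption: the set of global models is given as a finite, explicit list, whereas in the framework it is generally infinite. The relation name `person` and the function name `certain_answers` are hypothetical.

```python
def certain_answers(query, models):
    """Return the tuples that `query` yields in every interpretation in `models`.

    `models` stands in for sem(D); here it is assumed to be a finite, explicit list.
    """
    results = [set(query(model)) for model in models]
    if not results:
        # With no global model, every tuple is vacuously an answer.
        raise ValueError("sem(D) is empty")
    return set.intersection(*results)


# Two hypothetical global models that disagree on one tuple.
MODELS = [
    {"person": {("ann",), ("bob",)}},
    {"person": {("ann",), ("carl",)}},
]

print(certain_answers(lambda m: m["person"], MODELS))
```

Only the tuple on which both models agree survives, which is the sense in which query answering here is reasoning with incomplete information.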

3  Specifying  the  Mapping 

As we said before, the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ represents the heart of a DIS
$\mathcal{D} = \langle \mathcal{G}, \mathcal{S}, \mathcal{M}_{\mathcal{G},\mathcal{S}} \rangle$, and allows for mapping a concept in one schema into a view, i.e., a
query, over the other schemas. In this section we discuss the various ways that
one can use for specifying the mapping. The terminology used in this section is
inspired by [15,14]. In our analysis, we will concentrate on mappings with “sound”
views. More general kinds of mappings are discussed in [4].

3.1  Global-Centric  Approach 

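In the global-centric approach (aka global-as-view approach), we assume we have
a query language $\mathcal{V}_{\mathcal{S}}$ over the alphabet $\mathcal{A}_{\mathcal{S}}$, and the mapping between the global
and the source schemas is given by associating to each term in the global schema
a view, i.e., a query, over the sources. The intended meaning of associating to a
term $C$ in $\mathcal{G}$ a query $V_{\mathcal{S}}$ over $\mathcal{S}$ is that such a query represents the best way to
characterize the instances of $C$ using the concepts in $\mathcal{S}$. Let $\mathcal{B}$ be a source model
for $\mathcal{D}$, and $\mathcal{I}$ a global interpretation for $\mathcal{D}$. Then $\mathcal{I}$ satisfies the pair $\langle C, V_{\mathcal{S}} \rangle$ in
$\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$ if all the tuples satisfying $V_{\mathcal{S}}$ in $\mathcal{B}$ satisfy $C$ in $\mathcal{I}$. We say that $\mathcal{I}$
satisfies the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$ if $\mathcal{I}$ satisfies every pair in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$.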
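
The satisfaction condition for a sound global-centric mapping amounts to a containment test for every pair. A minimal Python sketch follows, assuming finite relations stored as sets of tuples; the names `satisfies_gav`, `s1`, `s2`, and `person` are hypothetical and chosen only for illustration.

```python
def satisfies_gav(global_interp, source_model, mapping):
    """Sound global-centric mapping: for each pair (C, V_S), the result of V_S on the
    source model must be contained in the extension of C in the global interpretation."""
    return all(
        view(source_model) <= global_interp.get(concept, set())
        for concept, view in mapping.items()
    )


SOURCE_MODEL = {"s1": {("ann",)}, "s2": {("bob",)}}
MAPPING = {"person": lambda b: b["s1"] | b["s2"]}

exact = {"person": {("ann",), ("bob",)}}
larger = {"person": {("ann",), ("bob",), ("carl",)}}
smaller = {"person": {("ann",)}}

print(satisfies_gav(exact, SOURCE_MODEL, MAPPING))    # True
print(satisfies_gav(larger, SOURCE_MODEL, MAPPING))   # True
print(satisfies_gav(smaller, SOURCE_MODEL, MAPPING))  # False
```

Because the views are only sound, the interpretation `larger` is accepted as well: the sources may not list every instance of the global concept.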

The global-centric approach is the one adopted in most data integration
systems. It is a common opinion that this mechanism allows for a simple query
processing strategy, which basically reduces to unfolding the query using the
definition specified in the mapping, so as to translate the query in terms of
accesses to the sources [20]. Recently, we have shown that in the case where
we add constraints (even of a very simple form) to the global schema, query
processing becomes harder, due to the need to deal with a form of incomplete
information.

3.2  Source-Centric Approach

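In the source-centric approach (aka local-as-view approach), we assume we have
a query language $\mathcal{V}_{\mathcal{G}}$ over the alphabet $\mathcal{A}_{\mathcal{G}}$, and the mapping between the global
and the source schemas is given by associating to each term in the source schemas
a view, i.e., a query, over the global schema. Again, the intended meaning of
associating to a term $C$ in $\mathcal{S}$ a query $V_{\mathcal{G}}$ over $\mathcal{G}$ is that such a query represents
the best way to characterize the instances of $C$ using the concepts in $\mathcal{G}$. Let $\mathcal{B}$
be a source model for $\mathcal{D}$, and $\mathcal{I}$ a global interpretation for $\mathcal{D}$. Then $\mathcal{I}$ satisfies
the pair $\langle V_{\mathcal{G}}, C \rangle$ in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$ if all the tuples satisfying $C$ in $\mathcal{B}$ satisfy $V_{\mathcal{G}}$ in
$\mathcal{I}$. As in the global-centric approach, we say that $\mathcal{I}$ satisfies the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$
wrt $\mathcal{B}$ if $\mathcal{I}$ satisfies every pair in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$.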

Recent  work  on  data  integration  follows  the  source-centric  approach  [17,5,3]. 
The  major  challenge  of  this  approach  is  that  in  order  to  answer  a  query  expressed 
over  the  global  schema,  one  must  be  able  to  reformulate  the  query  in  terms  of 
queries  to  the  sources.  While  in  the  global-centric  approach  such  a  reformulation 
is  guided  by  the  definitions  in  the  mapping,  here  the  problem  requires  a  reasoning 
step,  so  as  to  infer  how  to  use  the  sources  for  answering  the  query  [8,3].  Many 
authors  point  out  that,  despite  its  difficulty,  the  source-centric  approach  better 
supports a dynamic environment, where source schemas can be added to the
system without the need to restructure the global schema.

3.3  Unrestricted  Mapping 

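In the unrestricted approach, we have both a query language $\mathcal{V}_{\mathcal{S}}$ over the
alphabet $\mathcal{A}_{\mathcal{S}}$, and a query language $\mathcal{V}_{\mathcal{G}}$ over the alphabet $\mathcal{A}_{\mathcal{G}}$, and the mapping
between the global and the source schemas is given by relating views over the
global schema to views over the source schemas. Again, the intended meaning of
relating the view $V_{\mathcal{G}}$ over the global schema to the view $V_{\mathcal{S}}$ over the source schemas
is that $V_{\mathcal{S}}$ represents the best way to characterize the objects satisfying $V_{\mathcal{G}}$ in
terms of the concepts in $\mathcal{S}$. In other words, in the unrestricted approach we try
to combine and extend the representation power of the previous approaches. Let
$\mathcal{B}$ be a source model for $\mathcal{D}$, and $\mathcal{I}$ a global interpretation for $\mathcal{D}$. Then $\mathcal{I}$ satisfies
the pair $\langle V_{\mathcal{G}}, V_{\mathcal{S}} \rangle$ in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$ if all the tuples satisfying $V_{\mathcal{S}}$ in $\mathcal{B}$
satisfy $V_{\mathcal{G}}$ in $\mathcal{I}$. Again, we say that $\mathcal{I}$ satisfies the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$ if $\mathcal{I}$
satisfies every pair in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ wrt $\mathcal{B}$.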
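
A hedged Python sketch of this pairwise condition is given below. It assumes finite relations as sets of tuples; the relation names `parent` and `s_gp` and all function names are hypothetical. The example pair relates a view over the global schema (grandparent, computed by joining `parent` with itself) to an atomic source relation, so it is also an instance of the source-centric case.

```python
def satisfies_mapping(global_interp, source_model, pairs):
    """Sound mapping: for each pair (V_G, V_S), the tuples returned by V_S on the
    source model must all be returned by V_G on the global interpretation."""
    return all(v_s(source_model) <= v_g(global_interp) for v_g, v_s in pairs)


def grandparent_view(interp):
    """View over the (hypothetical) global relation parent(x, y)."""
    parent = interp["parent"]
    return {(x, z) for (x, y1) in parent for (y2, z) in parent if y1 == y2}


PAIRS = [(grandparent_view, lambda b: b["s_gp"])]
SOURCE_MODEL = {"s_gp": {("ann", "carl")}}

i1 = {"parent": {("ann", "bob"), ("bob", "carl")}}
i2 = {"parent": {("ann", "dan"), ("dan", "carl")}}
i3 = {"parent": {("ann", "carl")}}

print(satisfies_mapping(i1, SOURCE_MODEL, PAIRS))  # True
print(satisfies_mapping(i2, SOURCE_MODEL, PAIRS))  # True
print(satisfies_mapping(i3, SOURCE_MODEL, PAIRS))  # False
```

Both `i1` and `i2` satisfy the mapping although they disagree on who the intermediate parent is, which illustrates why answering queries over such mappings requires reasoning rather than plain unfolding.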

This approach is largely unexplored, mainly because it combines the difficulties
of the other ones. However, we argue that, in real-world settings, this is the
only approach that provides the appropriate expressive power.

4  Beyond  First-Order  Logic 

According to our definition of a DIS $\mathcal{D}$, it is easy to see that it may happen that
no global model for $\mathcal{D}$ exists, even when at least one source model for $\mathcal{D}$ exists.
This may happen because knowledge in the various source schemas cannot be
completely reconciled in the global schema. In the formalization presented in
the previous sections, this situation gives rise to an inconsistent DIS $\mathcal{D}$ (i.e.,
$sem(\mathcal{D}) = \emptyset$), which cannot support query processing.
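
As a hedged illustration of how such an inconsistency can arise, the Python sketch below assumes a hypothetical global relation person(name, age) whose first attribute is a key, together with a sound global-centric mapping that defines person as the union of two hypothetical sources `s1` and `s2`. None of these names or constraints appear in the paper; they are chosen only to make the situation concrete.

```python
def key_holds(relation):
    """Assumed global-schema constraint: the first attribute of each tuple is a key."""
    keys = [row[0] for row in relation]
    return len(keys) == len(set(keys))


def has_global_model(source_model):
    """True iff some extension of person contains s1 | s2 and respects the key.

    Every interpretation satisfying the sound mapping contains s1 | s2, and adding
    tuples can never repair a key violation, so testing the smallest candidate suffices.
    """
    smallest = source_model["s1"] | source_model["s2"]
    return key_holds(smallest)


COHERENT = {"s1": {("ann", 30)}, "s2": {("bob", 41)}}
CONFLICTING = {"s1": {("ann", 30)}, "s2": {("ann", 31)}}

print(has_global_model(COHERENT))     # True
print(has_global_model(CONFLICTING))  # False
```

In the second case both sources are individually fine, yet no global model exists, so under the semantics above every query would be left without a meaningful answer.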

A  more  general  approach  would  be  to  provide  a  formalization  that  is  able 
to  support  query  processing  even  when  the  source  schemas  to  be  integrated  are 
mutually  incoherent.  Here,  we  present  a  preliminary  proposal  aiming  at  this  goal. 

The basic idea is that given a DIS $\mathcal{D} = \langle \mathcal{G}, \mathcal{S}, \mathcal{M}_{\mathcal{G},\mathcal{S}} \rangle$ and a source model $\mathcal{B}$
for $\mathcal{D}$, we would like to focus our attention on those global interpretations $\mathcal{I}$ that
are models of the global schema $\mathcal{G}$ and that approximate as much as possible the
satisfaction relation for the mapping $\mathcal{M}_{\mathcal{G},\mathcal{S}}$. One way to formalize this idea is
to distinguish between strict mappings, as the ones considered in Section 3, and
loose mappings. In particular, we add to every pair in $\mathcal{M}_{\mathcal{G},\mathcal{S}}$ a new item, which is
either strict or loose, and then we define an ordering wrt $\mathcal{B}$ between the models
of $\mathcal{G}$. We concentrate directly on the most general case of unrestricted mapping.

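If $\mathcal{I}_1$ and $\mathcal{I}_2$ are two models of $\mathcal{G}$, we say that $\mathcal{I}_1$ is better than $\mathcal{I}_2$ wrt
$\mathcal{B}$, denoted as $\mathcal{I}_1 >_{\mathcal{B}} \mathcal{I}_2$, iff for all triples
$\langle V_{\mathcal{G}}, V_{\mathcal{S}}, x \rangle \in \mathcal{M}_{\mathcal{G},\mathcal{S}}$, except for a
distinguished one $\langle V'_{\mathcal{G}}, V'_{\mathcal{S}}, loose \rangle$, we have that
$V_{\mathcal{G}}^{\mathcal{I}_1} = V_{\mathcal{G}}^{\mathcal{I}_2}$ and [...],
while for the distinguished triple $\langle V'_{\mathcal{G}}, V'_{\mathcal{S}}, x' \rangle$ we have that [...],
and there exists a tuple $t \in {V'_{\mathcal{S}}}^{\mathcal{B}}$ such that $t \in {V'_{\mathcal{G}}}^{\mathcal{I}_1}$ and
$t \notin {V'_{\mathcal{G}}}^{\mathcal{I}_2}$. It is easy to
verify that the relation $>_{\mathcal{B}}$ is a partial order. With this notion in place we define
global models for $\mathcal{D}$ wrt $\mathcal{B}$ as those models $\mathcal{I}$ of $\mathcal{G}$ that are maximal wrt $>_{\mathcal{B}}$, i.e.,
for no other model $\mathcal{I}'$ of $\mathcal{G}$, $\mathcal{I}' >_{\mathcal{B}} \mathcal{I}$.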

5  Conclusions 

We  have  illustrated  a  logic-based  framework  for  data  integration,  and  we  have 
discussed  several  choices  for  the  specification  of  the  mapping  between  the  global 
schema and the source schemas. The form of such a specification greatly influences
the method for query answering. As we said before, most of the research
work on data integration is based on first-order logic, following either
the global-centric or the source-centric approach. However, it is our opinion that,
in  real  world  settings,  the  case  of  mutually  inconsistent  source  databases  will  be 
very  common.  We  have  presented  some  preliminary  ideas  in  order  to  extend  the 
integration  framework  with  suitable  nonmonotonic  reasoning  features  for  dealing 
with  this  case,  and  we  plan  to  study  query  processing  strategies  based  on  these 
ideas. 

Abstract. Nonmonotonic logic programming (NMLP) and inductive
logic programming (ILP) are two important extensions of logic programming.
The former aims at representing incomplete knowledge and reasoning
with commonsense, while the latter targets the problem of inductive
construction of a general theory from examples and background knowledge.
NMLP and ILP thus have seemingly different motivations and
goals, but they have much in common in the background of problems,
and techniques developed in each field are related to one another. This
paper presents techniques for combining these two fields of logic programming
in the context of nonmonotonic inductive logic programming
(NMILP). We review recent results and problems to realize NMILP.

1  Introduction 

Representing knowledge in computational logic gives formal foundations of
artificial intelligence (AI) and provides computational methods for solving
problems. Logic programming supplies a powerful tool for representing declarative
knowledge and computing logical inference. However, logic programming based
on classical Horn logic is not sufficiently expressive for representing incomplete
human knowledge, and is inadequate for characterizing nonmonotonic commonsense
reasoning. Nonmonotonic logic programming (NMLP) [3,5] is introduced
to overcome such limitations of Horn logic programming by extending the
representation language and enhancing the inference mechanism. The purpose of
NMLP is to represent incomplete knowledge and reason with commonsense in a
program.

On the other hand, machine learning is concerned with the problem of building
computer programs that automatically construct new knowledge and improve
with experience [27]. The primary inference used in learning is induction, which
constructs general sentences from input examples. Inductive Logic Programming
(ILP) [28,30,33] realizes inductive machine learning in logic programming, which
provides a formal background to inductive learning and has the advantage of
using computational tools developed in logic programming. The goal of ILP is the
inductive construction of first-order clausal theories from examples and
background knowledge.

NMLP and ILP thus have seemingly different motivations and goals; however,
they have much in common in the background of problems, and techniques
developed in each field are related to one another. First, the process of
discovering new knowledge by humans is the iteration of hypothesis generation and
revision, which is inherently nonmonotonic. Indeed, induction is nonmonotonic
reasoning in the sense that once-induced hypotheses might be changed by the
introduction of new evidence. Second, induction problems assume background
knowledge which is incomplete; otherwise there is no need to learn. Therefore,
representing and reasoning with incomplete knowledge are vital issues in ILP.
Third, NMLP uses hypotheses in the process of commonsense reasoning, and
hypothesis generation is particularly important in abductive logic programming.
Abduction generates hypotheses in a different manner from induction, but they
are both inverse deduction and extend theories to account for evidence.
Indeed, abduction and induction interact, and work complementarily in many
phases [14]. Fourth, in NMLP updates of general rules are considered in the
context of intensional knowledge base update [6], while a similar problem is
captured in ILP as concept-learning [26]. It is argued in [9] that these two lines of
research handle the same problem when formulated in a logical framework. For these
reasons, it is clear that both NMLP and ILP cope with similar problems and
have close links to each other.

Comparing NMLP and ILP, NMLP performs default reasoning and derives
plausible conclusions from incomplete knowledge bases. Various types of
inferences and semantics are introduced to extract intuitive conclusions from a
program. NMLP may change conclusions by the introduction of new information,
but it has no mechanism of learning new knowledge from the input. By
contrast, ILP extends a theory by constructing new rules from input examples and
background knowledge. Discovered rules reveal hidden laws between examples
and background knowledge, and are also used for predicting unseen phenomena.
However, the present ILP mostly considers Horn logic programs or classical
clausal programs as background knowledge, and has limited applications to
nonmonotonic situations.

Thus, both NMLP and ILP have limitations in their present frameworks and
complement each other. Since both commonsense reasoning and machine learning
are indispensable for realizing intelligent information systems, combining
techniques of the two fields in the context of nonmonotonic inductive logic
programming (NMILP) is meaningful and important. Such a combination will extend
the representation language on the ILP side, while it will introduce a learning
mechanism to programs on the NMLP side. Moreover, linking different
extensions of logic programming will strengthen the capability of logic programming
as a knowledge representation tool in AI. From the practical viewpoint, the
combination will be beneficial for ILP to use well-established techniques in NMLP,
and will open new applications of NMLP.

NMLP realizes nonmonotonic reasoning using negation as failure (NAF).
Some researchers in ILP, however, argue that negation as failure is inappropriate
in machine learning. In [8], the authors say:

    For concept learning, negation as failure (and the underlying closed world
    assumption) is unacceptable because it acts as if everything is known.
    Clearly, in learning this is not the case, since otherwise nothing ought to
    be learned.

Although the account is plausible, it does not justify excluding NAF in ILP.
Suppose that background knowledge is given as a Horn logic program, and the
CWA or NAF infers negative facts which are not derived from the program.
When new evidence $E$ which is initially assumed false under the CWA or NAF
is observed, this just means that the old assumption is rebutted. The task of
inductive learning is then to revise the old theory to explain the new evidence. On
the other hand, if one excludes NAF in a background program, one loses the means
of representing default negation in the program. This is a significant drawback in
representing knowledge and restricts the application of ILP. In fact, NAF enables
one to write shorter and simpler programs and appears in many basic but practical
Prolog programs such as computing set differences, finding union/intersection of
two lists, etc. [42]. Horn ILP precludes every program including these rules with
NAF. Thus, NAF is also important in ILP, and the use of NAF never invalidates
the need for learning.

In the field of ILP, the so-called nonmonotonic problem setting is often
considered [18]. Given a background Horn logic program $P$ and a set $E$ of positive
examples, it computes a hypothesis $H$ which is satisfied in the least Herbrand
model of $P \cup E$. This is also called the weak setting of ILP [11]. In this setting,
any fact which is not derived from $P \cup E$ is assumed to be false under the closed
world assumption (CWA). By contrast, the strong setting of ILP computes a
hypothesis $H$ which, together with $P$, implies $E$, and does not imply negative
examples. The strong setting is usually employed in ILP and is also considered
in this paper (see Section 2.2).¹ The nonmonotonic setting is called
“nonmonotonic” in the sense that it performs a kind of default reasoning based on the
closed world assumption. Some systems take similar approaches using Clark’s
completion ([10], for instance). The above-mentioned nonmonotonic setting is
clearly different from our problem setting. The former still considers an
induction problem within clausal logic, while we extend the problem to nonmonotonic
logic programs.

This  paper  presents  techniques  for  realizing  inductive  machine  learning  in 
nonmonotonic logic programs. The paper is not intended to provide a comprehensive
survey of the state of the art, but mainly consists of recent research
results  of  the  author.  The  rest  of  this  paper  is  organized  as  follows.  Section  2 
reviews  frameworks  of  NMLP  and  ILP.  Section  3  presents  various  techniques  for 
induction  in  nonmonotonic  logic  programs.  Section  4  summarizes  the  paper  and 
addresses  open  issues. 

¹ The weak setting is also called descriptive/confirmatory induction, while the strong
setting is called explanatory/predictive induction [15].

2  Preliminaries

2.1  Nonmonotonic  Logic  Programming 

Nonmonotonic logic programs considered in this paper are normal logic programs,
i.e., logic programs with negation as failure.

A normal logic program (NLP) is a set of rules of the form:

$$A \leftarrow B_1, \ldots, B_m, not\, B_{m+1}, \ldots, not\, B_n \qquad (1)$$

where each $A$, $B_i$ ($1 \le i \le n$) is an atom and $not$ represents negation as failure
(NAF). The left-hand side of $\leftarrow$ is the head, and the right-hand side is the
body of the rule. The conjunction in the body of (1) is identified with the set
$\{ B_1, \ldots, B_m, not\, B_{m+1}, \ldots, not\, B_n \}$. For a rule $R$, $head(R)$ and $body(R)$ denote
the head of $R$ and the body of $R$, respectively. The conjunction in the body is
often written by the Greek letter $\Gamma$. A rule with the empty body $A \leftarrow$ is called a
fact, which is identified with the atom $A$. A rule with the empty head $\leftarrow \Gamma$ with
$\Gamma \neq \emptyset$ is also called an integrity constraint. Throughout the paper a program
means a normal logic program unless stated otherwise. A program $P$ is Horn
if no rule in $P$ contains NAF. A Horn program is definite if it contains no
integrity constraint. The Herbrand base $HB$ of a program $P$ is the set of all
ground atoms in the language of $P$. Given the Herbrand base $HB$, we define
$HB^+ = HB \cup \{\, not\, A \mid A \in HB \,\}$. Any element in $HB^+$ is called an LP-literal,
and an LP-literal of the form $not\, A$ is called an NAF-literal. We say that two
LP-literals $L_1$ and $L_2$ have the same sign if either ($L_1 \in HB$ and $L_2 \in HB$) or
($L_1 \notin HB$ and $L_2 \notin HB$). For an LP-literal $L$, $pred(L)$ denotes the predicate
in $L$ and $const(L)$ denotes the set of constants appearing in $L$. A program, a rule,
or an LP-literal is ground if it contains no variable. A program/rule containing
variables is semantically identified with its ground instantiation, i.e., the set
of ground rules obtained from the program/rule by substituting variables with
elements of the Herbrand universe in every possible way.

An interpretation is a subset of $HB$. An interpretation $I$ satisfies the ground
rule $R$ of the form (1) if $\{B_1, \ldots, B_m\} \subseteq I$ and $\{B_{m+1}, \ldots, B_n\} \cap I = \emptyset$ imply
$A \in I$ (written as $I \models R$). In particular, $I$ satisfies the ground integrity
constraint $\leftarrow B_1, \ldots, B_m, not\, B_{m+1}, \ldots, not\, B_n$ if either $\{B_1, \ldots, B_m\} \setminus I \neq \emptyset$ or
$\{B_{m+1}, \ldots, B_n\} \cap I \neq \emptyset$. When a rule $R$ contains variables, $I \models R$ means that $I$
satisfies every ground instance of $R$. An interpretation which satisfies every rule
in a program is a model of the program. A model $M$ of a program $P$ is minimal
if there is no model $N$ of $P$ such that $N \subset M$. A Horn logic program has at
most one minimal model, called the least model.
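
The satisfaction conditions above can be made concrete with a small Python sketch. It assumes ground programs only, with each rule encoded as a hypothetical triple `(head, positive_body, naf_body)` and `head = None` for an integrity constraint; this encoding and the function names are illustrative choices, not part of the paper.

```python
def satisfies(interp, rule):
    """I |= R for a ground rule R = (head, positive_body, naf_body)."""
    head, pos, neg = rule
    body_true = set(pos) <= interp and not (set(neg) & interp)
    if not body_true:
        return True
    # An integrity constraint (head is None) is violated whenever its body is true.
    return head is not None and head in interp


def is_model(interp, program):
    return all(satisfies(interp, rule) for rule in program)


def least_model(definite_program):
    """Least model of a ground definite program, computed by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos, _neg in definite_program:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


# q <- ;  p <- q ;  r <- s
PROGRAM = [("q", [], []), ("p", ["q"], []), ("r", ["s"], [])]

print(sorted(least_model(PROGRAM)))              # ['p', 'q']
print(is_model({"p", "q", "r", "s"}, PROGRAM))   # True, but not minimal
print(is_model({"q"}, PROGRAM))                  # False: p <- q is violated
```

The second call shows a model that is not minimal, and the first call computes the unique minimal one for this definite program.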

For the semantics of NLPs, we consider the stable model semantics [17] in
this paper. Given a program $P$ and an interpretation $M$, the ground Horn logic
program $P^M$ is defined as follows: the rule $A \leftarrow B_1, \ldots, B_m$ is in $P^M$ iff there
is a ground rule of the form (1) in the ground instantiation of $P$ such that
$\{B_{m+1}, \ldots, B_n\} \cap M = \emptyset$. If the least model of $P^M$ is identical to $M$, $M$ is called
a stable model of $P$. A program may have none, one, or multiple stable models
in general. A program having exactly one stable model is called categorical [3].
A stable model coincides with the least model in a Horn logic program. A locally
stratified program [36] has a unique stable model, which is called the perfect
model. Given a stable model $M$, we define $M^+ = M \cup \{\, not\, A \mid A \in HB \setminus M \,\}$.
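
The construction of $P^M$ and the stability test can be sketched in Python as follows. The sketch assumes ground programs without integrity constraints, encodes each rule as a hypothetical triple `(head, positive_body, naf_body)`, and enumerates candidate interpretations by brute force, which is only feasible for very small programs. All names are illustrative.

```python
from itertools import combinations


def least_model(horn_rules):
    """Least model of a ground definite program given as (head, positive_body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in horn_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def reduct(program, m):
    """P^M: keep a rule only if none of its NAF atoms is in M, then drop the NAF part."""
    return [(head, pos) for head, pos, neg in program if not (set(neg) & m)]


def is_stable(program, m):
    return least_model(reduct(program, m)) == m


def stable_models(program):
    """Brute-force enumeration over all subsets of the atoms occurring in the program."""
    atoms = sorted({a for head, pos, neg in program for a in (head, *pos, *neg)})
    return [
        set(cand)
        for size in range(len(atoms) + 1)
        for cand in combinations(atoms, size)
        if is_stable(program, set(cand))
    ]


CATEGORICAL = [("p", [], ["q"])]                     # p <- not q
TWO_MODELS = [("p", [], ["q"]), ("q", [], ["p"])]    # p <- not q ; q <- not p
NO_MODEL = [("p", [], ["p"])]                        # p <- not p

print(stable_models(CATEGORICAL))  # [{'p'}]
print(stable_models(TWO_MODELS))   # [{'p'}, {'q'}]
print(stable_models(NO_MODEL))     # []
```

The three programs illustrate the three cases named in the text: exactly one stable model (categorical), multiple stable models, and none.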

A program is consistent (under the stable model semantics) if it has a stable
model; otherwise a program is inconsistent. Throughout the paper, a program
is assumed to be consistent unless stated otherwise. If every stable model of a
program $P$ satisfies a rule $R$, it is written as $P \models_s R$. Else if no stable model of a
program $P$ satisfies a rule $R$, it is written as $P \models_s not\, R$. In particular, $P \models_s A$
if a ground atom $A$ is true in every stable model of $P$; and $P \models_s not\, A$ if $A$ is
false in every stable model of $P$. By contrast, if every model of $P$ satisfies $R$,
it is written as $P \models R$. Note that when $P$ is Horn, the meaning of $\models$ coincides
with the classical entailment.


2.2  Inductive  Logic  Programming 

A typical ILP problem is stated as follows. Given a logic program $B$
representing background knowledge and a set $E^+$ of positive examples and a set $E^-$ of
negative examples, find hypotheses $H$ satisfying²

1. $B \cup H \models e$ for every $e \in E^+$.

2. $B \cup H \not\models f$ for every $f \in E^-$.

3. $B \cup H$ is consistent.

The first condition is called completeness with respect to positive examples,
and the second condition is called consistency with respect to negative examples.
It is also implicitly assumed that $B \not\models e$ for some $e \in E^+$ or $B \models f$ for some
$f \in E^-$, because otherwise there is no need to introduce $H$. A hypothesis $H$
covers (resp. uncovers) an example $e$ if $B \cup H \models e$ (resp. $B \cup H \not\models e$).

The goal of ILP is then to develop an algorithm which efficiently computes
hypotheses satisfying the above three conditions. Induction algorithms are roughly
classified into two categories by the direction in which they search for hypotheses.
A top-down algorithm first generates a most general hypothesis and refines it by
means of specialization, while a bottom-up algorithm searches hypotheses by
generalizing (positive) examples. Each algorithm locally alternates search
directions from general to specific and vice versa to correct hypotheses. Algorithms
presented in Sections 3.1-3.3 of this paper are bottom-up in this sense.

An induction algorithm is correct if every hypothesis produced by the algorithm
satisfies the above three conditions. By contrast, an induction algorithm is
complete if it produces every rule satisfying the conditions. Note that
correctness is generally required of algorithms, while completeness is problematic
in practice. For instance, consider the background program B and the positive
example E such that

B :  $r(f(x)) \leftarrow r(x)$,
     $q(a) \leftarrow$,  $r(b) \leftarrow$.

E :  $p(a)$.

¹ When there is no negative example, $E^+$ is just written as E.

Then, any of the following rules

$p(x) \leftarrow q(x)$,
$p(x) \leftarrow q(x), r(b)$,
$p(x) \leftarrow q(x), r(f(b))$,


explains $p(a)$. In general, there may be infinitely many solutions for explaining an
example, and designing a complete induction algorithm without any restriction
is of little value in practice. In order to extract meaningful hypotheses, additional
conditions are usually imposed on possible hypotheses to reduce the search space.
Such a condition is called an induction bias and is defined as any information
that syntactically or semantically influences learning processes.

In  the  field  of  ILP,  most  studies  consider  a  Horn  logic  program  as  background 
knowledge  and  induce  Horn  clauses  as  hypotheses.  In  this  paper,  we  consider  an 
NLP  as  background  knowledge  and  induce  hypothetical  rules  possibly  containing 
NAF.  In  the  next  section,  we  give  several  algorithms  which  realize  this. 

3  Induction  in  Nonmonotonic  Logic  Programs 

3.1  Least  Generalization 

Generalization is a basic operation to perform induction. In his seminal work [34],
Plotkin introduces generalization in clausal theories based on subsumption. Given
two clauses $C_1$ and $C_2$, $C_1$ θ-subsumes $C_2$ if $C_1\theta \subseteq C_2$ for some substitution θ.
Then, $C_1$ is more general than $C_2$ under θ-subsumption if $C_1$ θ-subsumes $C_2$.
In normal logic programs, a subsumption relation between rules is defined as
follows.

Definition 3.1. (subsumption relations between rules) Let $R_1$ and $R_2$ be two
rules. Then, $R_1$ θ-subsumes $R_2$ (written as $R_1 \succeq_\theta R_2$) if $\mathit{head}(R_1)\theta = \mathit{head}(R_2)$
and $\mathit{body}(R_1)\theta \subseteq \mathit{body}(R_2)$ hold for some substitution θ. In this case, $R_1$ is said
to be more general than $R_2$ under θ-subsumption.

Thus subsumption is defined for comparison of rules with the same predicate
in the heads. The same definition is employed by Taylor [43]. Fogel and
Zaverucha [16] discuss the effect of subsumption to reduce the search space in
normal logic programs.

For generalization in clausal theories, least generalizations of clauses are
particularly important. The notion is defined for nonmonotonic rules as follows.

Definition 3.2. (least generalization under subsumption) Let $\mathcal{R}$ be a finite set
of rules such that every rule in $\mathcal{R}$ has the same predicate in the head. Then, a
rule R is a least generalization of $\mathcal{R}$ under θ-subsumption if $R \succeq_\theta R_i$ for every
rule $R_i$ in $\mathcal{R}$, and for any other rule $R'$ satisfying $R' \succeq_\theta R_i$ for every $R_i$ in $\mathcal{R}$, it
holds that $R' \succeq_\theta R$.

In the clausal language every finite set of clauses has a least generalization.
In particular, every finite set of Horn clauses has a least generalization as a Horn
clause [33,34].² When we consider normal logic programs, rules are syntactically
regarded as Horn clauses by viewing an NAF-literal $\mathit{not}\, p(x)$ as an atom $\mathit{not\_p}(x)$
with the new predicate $\mathit{not\_p}$. Then the result of Horn logic programs is directly
carried over to normal logic programs.

Theorem 3.1. (existence of a least generalization) Let $\mathcal{R}$ be a finite set of rules
such that every rule in $\mathcal{R}$ has the same predicate in the head. Then, every
non-empty set $R \subseteq \mathcal{R}$ has a least generalization under θ-subsumption.

A least generalization of two rules is computed as follows. First, a least
generalization of two terms $f(t_1, \ldots, t_n)$ and $g(s_1, \ldots, s_n)$ is a new variable v if
$f \neq g$; and is defined as $f(lg(t_1, s_1), \ldots, lg(t_n, s_n))$ if $f = g$, where $lg(t_i, s_i)$
means a least generalization of $t_i$ and $s_i$. Next, a least generalization of two
LP-literals $L_1 = (\mathit{not})\, p(t_1, \ldots, t_n)$ and $L_2 = (\mathit{not})\, q(s_1, \ldots, s_n)$ is undefined if $L_1$
and $L_2$ do not have the same predicate and sign; otherwise, it is defined as
$lg(L_1, L_2) = (\mathit{not})\, p(lg(t_1, s_1), \ldots, lg(t_n, s_n))$.

Then, a least generalization of two rules $R_1 = A_1 \leftarrow \Gamma_1$ and $R_2 = A_2 \leftarrow \Gamma_2$,
where $A_1$ and $A_2$ have the same predicate, is obtained as

$lg(A_1, A_2) \leftarrow \Gamma$

where $\Gamma = \{\, lg(\gamma_1, \gamma_2) \mid \gamma_1 \in \Gamma_1,\ \gamma_2 \in \Gamma_2 \text{ and } lg(\gamma_1, \gamma_2) \text{ is defined} \,\}$. In
particular, if $A_1$ and $A_2$ are empty, a least generalization of two integrity constraints
$\leftarrow \Gamma_1$ and $\leftarrow \Gamma_2$ is given by $\leftarrow \Gamma$. A least generalization of a finite set of rules is
computed by repeatedly applying the above procedure.
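
The procedure above can be sketched in Python as follows. This is an illustrative encoding, not the paper's implementation: constants and variables are strings, compound terms are tuples (functor, arguments...), an LP-literal is a hypothetical triple (naf, predicate, arguments), and a rule is a pair (head, body). One assumption goes beyond the text: the same pair of differing terms is always mapped to the same variable, as in Plotkin's standard construction, so that shared constants generalize to shared variables.

```python
def lg_term(s, t, table):
    """Least generalization of two terms; table maps term pairs to variables."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same functor and arity: generalize argument-wise.
        return (s[0],) + tuple(lg_term(a, b, table) for a, b in zip(s[1:], t[1:]))
    # Different functors or constants: one variable per distinct pair of terms.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]

def lg_literal(l1, l2, table):
    """Literals are (naf, predicate, args); returns None when lg is undefined."""
    (naf1, p1, args1), (naf2, p2, args2) = l1, l2
    if naf1 != naf2 or p1 != p2 or len(args1) != len(args2):
        return None
    return (naf1, p1, tuple(lg_term(a, b, table) for a, b in zip(args1, args2)))

def lg_rule(r1, r2):
    """Rules are (head, body); the heads are assumed to share a predicate."""
    table = {}
    head = lg_literal(r1[0], r2[0], table)
    body = []
    for g1 in r1[1]:
        for g2 in r2[1]:
            g = lg_literal(g1, g2, table)
            if g is not None and g not in body:
                body.append(g)
    return head, body

# flies(tweety) <- bird(tweety), not ab(tweety)
r1 = ((False, "flies", ("tweety",)),
      [(False, "bird", ("tweety",)), (True, "ab", ("tweety",))])
# flies(polly) <- bird(polly), not ab(polly)
r2 = ((False, "flies", ("polly",)),
      [(False, "bird", ("polly",)), (True, "ab", ("polly",))])

print(lg_rule(r1, r2))
```

The body is built from every pair of body literals, so for larger rules the result may contain redundant literals that can be removed afterwards.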

In ILP generalization is usually considered in relation to the background
knowledge. Plotkin [35] extends subsumption to relative subsumption for this use.
Given the background knowledge B as a clausal theory, a clause C subsumes D
relative to B if there is a substitution θ such that $B \models \forall (C\theta \rightarrow D)$.

We apply relative subsumption to normal logic programs. Let $R = H \leftarrow A, \Gamma$
be a rule where A is an atom and Γ is a conjunction. Suppose that there is a
rule $A' \leftarrow \Gamma'$ in a program P such that $A\theta = A'\theta$ for some substitution θ. Then,
we say that the rule $(H \leftarrow \Gamma', \Gamma)\theta$ is obtained by unfolding R in P. We also say
that $R_k$ is obtained by unfolding $R_0$ in P if there is a sequence $R_0, \ldots, R_k$ of
rules such that $R_i$ ($1 \leq i \leq k$) is obtained by unfolding $R_{i-1}$ in P.

Definition 3.3. (relative subsumption) Let P be an NLP, and $R_1$ and $R_2$ be
two rules. Then, $R_1$ θ-subsumes $R_2$ relative to P (written as $R_1 \succeq_\theta^P R_2$) if there
is a rule R that is obtained by unfolding $R_1$ in P and R θ-subsumes $R_2$. In this
case, $R_1$ is said to be more general than $R_2$ relative to P under θ-subsumption.

The above definition reduces to Definition 3.1 when P is empty. By the
definition, relative subsumption is also defined for two rules having the same
predicate in the heads. In clausal theories, Buntine [7] introduces generalized
subsumption, which is defined between definite clauses having the same predicate
in the heads. Comparing the two definitions, Buntine's definition is model theoretic,
while our definition is operational. Taylor [43] introduces normal subsumption,
which extends Buntine's subsumption to normal logic programs and is defined
in a model theoretic manner.

² If two clauses have no predicate with the same sign in common, the empty clause
becomes the least generalization.

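Example 3.1. Suppose the background program P, and two rules $R_1$ and $R_2$ as
follows.

P :  $\mathit{has\_wing}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\, \mathit{ab}(x)$,
     $\mathit{bird}(x) \leftarrow \mathit{sparrow}(x)$,
     $\mathit{ab}(x) \leftarrow \mathit{broken\_wing}(x)$.

$R_1$ :  $\mathit{flies}(x) \leftarrow \mathit{has\_wing}(x)$.

$R_2$ :  $\mathit{flies}(x) \leftarrow \mathit{sparrow}(x),\ \mathit{full\_grown}(x),\ \mathit{not}\, \mathit{ab}(x)$.

From P and $R_1$, the rule

$R_3$ :  $\mathit{flies}(x) \leftarrow \mathit{sparrow}(x),\ \mathit{not}\, \mathit{ab}(x)$

is obtained by unfolding. As $R_3$ θ-subsumes $R_2$, $R_1 \succeq_\theta^P R_2$.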
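
The unfolding steps behind Example 3.1 can be written out explicitly. The derivation below is a reconstruction from Definition 3.3 and the rules of the example; the name $R_1'$ for the intermediate rule is introduced here for illustration only.

```latex
\begin{align*}
R_1  &: \mathit{flies}(x) \leftarrow \mathit{has\_wing}(x) \\
R_1' &: \mathit{flies}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\ \mathit{ab}(x)
     && \text{(unfold } \mathit{has\_wing}(x) \text{ with the first rule of } P\text{)} \\
R_3  &: \mathit{flies}(x) \leftarrow \mathit{sparrow}(x),\ \mathit{not}\ \mathit{ab}(x)
     && \text{(unfold } \mathit{bird}(x) \text{ with the second rule of } P\text{)} \\
\mathit{body}(R_3) &= \{\mathit{sparrow}(x),\ \mathit{not}\ \mathit{ab}(x)\} \subseteq \mathit{body}(R_2)
     && \text{(with the identity substitution)}
\end{align*}
```

Since the heads of the unfolded rule and the second rule of the example also coincide, the unfolded rule θ-subsumes it, which gives the relative subsumption stated in the example.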

In clausal theories, a least generalization does not always exist under relative
subsumption. However, when background knowledge is a finite set of ground
atoms, a least generalization of two clauses is constructed [33,35]. The result
is extended to nonmonotonic rules and is rephrased in our context as follows.
Let P be a finite set of ground atoms, and $R_1$ and $R_2$ be two rules. Then, a
least generalization of these rules under relative subsumption is constructed as
a least generalization of $R_1'$ and $R_2'$ where $\mathit{head}(R_i') = \mathit{head}(R_i)$ and
$\mathit{body}(R_i') = \mathit{body}(R_i) \cup P$.

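Example 3.2. Suppose the background program P, and two (positive) examples
$R_1$ and $R_2$ as follows.

P :  $\mathit{bird}(\mathit{tweety}) \leftarrow$,  $\mathit{bird}(\mathit{polly}) \leftarrow$.

$R_1$ :  $\mathit{flies}(\mathit{tweety}) \leftarrow \mathit{has\_wing}(\mathit{tweety}),\ \mathit{not}\, \mathit{ab}(\mathit{tweety})$.

$R_2$ :  $\mathit{flies}(\mathit{polly}) \leftarrow \mathit{sparrow}(\mathit{polly}),\ \mathit{not}\, \mathit{ab}(\mathit{polly})$.

Then, $R_1'$ and $R_2'$ become

$R_1'$ :  $\mathit{flies}(\mathit{tweety}) \leftarrow \mathit{bird}(\mathit{tweety}),\ \mathit{bird}(\mathit{polly}),\ \mathit{has\_wing}(\mathit{tweety}),\ \mathit{not}\, \mathit{ab}(\mathit{tweety})$,

$R_2'$ :  $\mathit{flies}(\mathit{polly}) \leftarrow \mathit{bird}(\mathit{tweety}),\ \mathit{bird}(\mathit{polly}),\ \mathit{sparrow}(\mathit{polly}),\ \mathit{not}\, \mathit{ab}(\mathit{polly})$.

The least generalization of $R_1'$ and $R_2'$ is

$\mathit{flies}(x) \leftarrow \mathit{bird}(\mathit{tweety}),\ \mathit{bird}(\mathit{polly}),\ \mathit{bird}(x),\ \mathit{not}\, \mathit{ab}(x)$.

Removing redundant literals, it becomes

R :  $\mathit{flies}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\, \mathit{ab}(x)$.

In this case, it holds that $P \cup \{R\} \models_s R_i$ ($i = 1, 2$).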

3.2  Inverse  Resolution 

Inverse resolution [29] is based on the idea of inverting the resolution step
between clauses. There are two operators that carry out inverse resolution,
absorption and identification, which together are called the V-operators. Each
operator builds one of the two parent clauses given the other parent clause
and the resolvent. Suppose two rules $R_1 : B_1 \leftarrow \Gamma_1$ and $R_2 : A_2 \leftarrow B_2, \Gamma_2$.
When $B_1\theta_1 = B_2\theta_2$, the rule $R_3 : A_2\theta_2 \leftarrow \Gamma_1\theta_1, \Gamma_2\theta_2$ is produced by
unfolding $R_2$ with $R_1$. Absorption constructs $R_2$ from $R_1$ and $R_3$, while identification
constructs $R_1$ from $R_2$ and $R_3$ (see figure).


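[Figure: the V-operators, with parent rules $R_1 : B_1 \leftarrow \Gamma_1$ and $R_2 : A_2 \leftarrow B_2, \Gamma_2$ and their resolvent $R_3$.]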


Given a normal logic program P containing the rules $R_1$ and $R_3$, absorption
produces the program A(P) such that

$A(P) = (P \setminus \{R_3\}) \cup \{R_2\}$.

On the other hand, given an NLP P containing the rules $R_2$ and $R_3$, identification
produces the program I(P) such that

$I(P) = (P \setminus \{R_3\}) \cup \{R_1\}$.

Note that multiple A(P) or I(P) exist in general, depending on the
choice of the input rules in P. We write V(P) to mean either A(P) or I(P).

When P is a Horn logic program, any information implied by P is also implied
by V(P), namely

$V(P) \models P$.

In this regard, the V-operators generalize a Horn logic program. In the presence
of negation as failure in a program, however, the V-operators do not work as
generalization operations in general.
Example 3.3. Let P be the program:

$p(x) \leftarrow \mathit{not}\, q(x)$,   $q(x) \leftarrow r(x)$,   $s(x) \leftarrow r(x)$,   $s(a) \leftarrow$,

which has the stable model $\{\, p(a), s(a) \,\}$. Absorbing the third rule into the second
rule produces A(P):

$p(x) \leftarrow \mathit{not}\, q(x)$,   $q(x) \leftarrow s(x)$,   $s(x) \leftarrow r(x)$,   $s(a) \leftarrow$,

which has the stable model $\{\, q(a), s(a) \,\}$. Then, $P \models_s p(a)$ but $A(P) \not\models_s p(a)$.

A counter-example for identification is constructed in a similar manner. The
reason is clear, since in nonmonotonic logic programs newly proven facts may
block the derivation of other facts which are proven beforehand. As a result, the
V-operators may not generalize the original program. Moreover, the next example
shows that the V-operators often make a consistent program inconsistent.

Example 3.4. Let P be the program:

$p(x) \leftarrow q(x),\ \mathit{not}\, p(x)$,   $q(x) \leftarrow r(x)$,   $s(x) \leftarrow r(x)$,   $s(a) \leftarrow$,

which has the stable model $\{\, s(a) \,\}$. Absorbing the third rule into the second
rule produces A(P):

$p(x) \leftarrow q(x),\ \mathit{not}\, p(x)$,   $q(x) \leftarrow s(x)$,   $s(x) \leftarrow r(x)$,   $s(a) \leftarrow$,

which has no stable model.
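
Examples 3.3 and 3.4 can be replayed on their ground instances over the single constant a. The following Python sketch is an illustration under stated assumptions, not the paper's code: atoms p(a), q(a), r(a), s(a) are abbreviated to the strings "p", "q", "r", "s", a ground rule is a hypothetical triple (head, positive body, negative body), and `absorb` handles only the ground case where substitutions are trivial. Stable models are found by brute-force enumeration.

```python
from itertools import combinations

def least_model(horn):
    """Least model of a ground Horn program by forward chaining."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in horn:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def stable_models(program):
    """All sets M that equal the least model of the reduct of program by M."""
    atoms = sorted({a for h, pos, neg in program for a in (h, *pos, *neg)})
    found = []
    for k in range(len(atoms) + 1):
        for cand in map(set, combinations(atoms, k)):
            red = [(h, pos) for h, pos, neg in program if not (set(neg) & cand)]
            if least_model(red) == cand:
                found.append(cand)
    return found

def absorb(program, r1, r3):
    """Ground absorption: replace r3 by r2, folding the body of r1 into its head."""
    h1, pos1, neg1 = r1
    h3, pos3, neg3 = r3
    assert set(pos1) <= set(pos3) and set(neg1) <= set(neg3)
    r2 = (h3,
          [h1] + [b for b in pos3 if b not in pos1],
          [b for b in neg3 if b not in neg1])
    return [r2 if r == r3 else r for r in program]

s_from_r = ("s", ["r"], [])   # s(a) <- r(a)
q_from_r = ("q", ["r"], [])   # q(a) <- r(a)
fact_s = ("s", [], [])        # s(a) <-

p33 = [("p", [], ["q"]), q_from_r, s_from_r, fact_s]      # Example 3.3
p34 = [("p", ["q"], ["p"]), q_from_r, s_from_r, fact_s]   # Example 3.4

for prog in (p33, p34):
    print(stable_models(prog), stable_models(absorb(prog, s_from_r, q_from_r)))
```

For the first program the atom p(a) is lost after absorption, and for the second program the result has no stable model, matching the two examples.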

The above example shows that the V-operators have a destructive effect on
the meaning of programs in general. It is also known that they may destroy the
syntactic structure of programs such as acyclicity and local stratification [37].

These observations call for caution in applying the V-operators to NMLP. A
condition for the V-operators to generalize an NLP is as follows.

Theorem 3.2. (conditions for the V-operators to generalize programs) [37] Let
P be an NLP, and $R_1$, $R_2$, $R_3$ be the rules given at the beginning of this section. For any
NAF-literal $\mathit{not}\, L$ in P,³

(i) if L does not depend on the head of $R_3$ in P, then $P \models_s N$ implies $A(P) \models_s N$
for any $N \in \mathcal{HB}$.

(ii) if L does not depend on the atom $B_2$ of $R_2$ in P, then $P \models_s N$ implies
$I(P) \models_s N$ for any $N \in \mathcal{HB}$.

³ Here, depends on is a transitive relation defined as: A depends on B if there is a
ground rule from P s.t. A appears in the head and B appears in the body of the
rule.




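Example 3.5. Suppose the background program P and a (positive) example E
as follows.

P :  $\mathit{flies}(x) \leftarrow \mathit{sparrow}(x),\ \mathit{not}\, \mathit{ab}(x)$,
     $\mathit{bird}(x) \leftarrow \mathit{sparrow}(x)$,
     $\mathit{sparrow}(\mathit{tweety}) \leftarrow$,  $\mathit{bird}(\mathit{polly}) \leftarrow$.

E :  $\mathit{flies}(\mathit{polly})$.

Initially, $P \models_s \mathit{flies}(\mathit{tweety})$ but $P \not\models_s \mathit{flies}(\mathit{polly})$. Absorbing the second rule
into the first rule in P produces the program A(P) in which the first rule of P
is replaced by the following rule:

$\mathit{flies}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\, \mathit{ab}(x)$.

Then, $A(P) \models_s \mathit{flies}(\mathit{polly})$. Notice that $A(P) \models_s \mathit{flies}(\mathit{tweety})$ also holds.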

Taylor [43] introduces a different operator called normal absorption, which
generalizes normal logic programs.


3.3  Inverse Entailment

Suppose an induction problem

$B \cup \{H\} \models E$

where B is a Horn logic program and H and E are each single Horn clauses.
Inverse entailment (IE) [31] is based on the idea that a possible hypothesis H is
deductively constructed from B and E by inverting the entailment relation as

$B \cup \{\neg E\} \models \neg H$.

When  a  background  theory  is  a  nonmonotonic  logic  program,  however,  the  IE 
technique  cannot  be  used.  This  is  because  IE  is  based  on  the  deduction  theorem 
in  first-order  logic,  but  it  is  known  that  the  deduction  theorem  does  not  hold  in 
nonmonotonic  logics  in  general  [41]. 

To solve the problem, Sakama [38] introduced the entailment theorem in
normal logic programs. A nested rule is defined as

$A \leftarrow R$

where A is an atom and R is a rule of the form (1). An interpretation I satisfies a
ground nested rule $A \leftarrow R$ if $I \models R$ implies $A \in I$. For an NLP P, $P \models_s (A \leftarrow R)$
if $A \leftarrow R$ is satisfied in every stable model of P.

Theorem 3.3. (entailment theorem [38]) Let P be an NLP and R a rule such
that $P \cup \{R\}$ is consistent. For any ground atom A, $P \cup \{R\} \models_s A$ implies
$P \models_s A \leftarrow R$. Conversely, $P \models_s A \leftarrow R$ and $P \models_s R$ imply $P \cup \{R\} \models_s A$.

The entailment theorem corresponds to the deduction theorem and is used
for inverting entailment in normal logic programs.

Theorem 3.4. (IE in normal logic programs [38]) Let P be an NLP and R a
rule such that $P \cup \{R\}$ is consistent. For any ground LP-literal L, if $P \cup \{R\} \models_s L$
and $P \models_s \mathit{not}\, L$, then $P \models_s \mathit{not}\, R$.

Thus, the relation

$P \models_s \mathit{not}\, R$    (2)

provides a necessary condition for computing a rule R satisfying $P \cup \{R\} \models_s L$
and $P \models_s \mathit{not}\, L$. When L is an atom (resp. NAF-literal), it represents a positive
(resp. negative) example. The condition $P \models_s \mathit{not}\, L$ states that the example L is
initially false in every stable model of P. To simplify the problem, a program P
is assumed to be function-free and categorical in the rest of this section.

Given two ground LP-literals $L_1$ and $L_2$, the relation $L_1 \sim L_2$ is defined if
$\mathit{pred}(L_1) = \mathit{pred}(L_2)$ with a predicate of arity $\geq 1$ and $\mathit{const}(L_1) = \mathit{const}(L_2)$.
Let L be a ground LP-literal and S a set of ground LP-literals. Then, $L_1$ in S is
relevant to L if either (i) $L_1 \sim L$ or (ii) $L_1$ shares a constant with an LP-literal $L_2$
in S such that $L_2$ is relevant to L.

Let P be a program with the unique stable model M and A a ground atom
representing a positive example. Suppose that the relations $P \cup \{R\} \models_s A$ and
$P \models_s \mathit{not}\, A$ hold. By Theorem 3.4, the relation (2) holds, thereby

$M \not\models R$.    (3)

Then, we start to find a rule R satisfying the condition (3). Consider the integrity
constraint $\leftarrow \Gamma$ where Γ consists of ground LP-literals in $M^+$ which are relevant
to the positive example A.⁴ Since M does not satisfy this integrity constraint,

$M \not\models\ \leftarrow \Gamma$    (4)

holds. That is, $\leftarrow \Gamma$ is a rule which satisfies the condition (3).

Next, by $P \models_s \mathit{not}\, A$, it holds that $A \notin M$, thereby $\mathit{not}\, A \in M^+$. Since $\mathit{not}\, A$
is relevant to A, the integrity constraint $\leftarrow \Gamma$ contains $\mathit{not}\, A$ in its body. Then,
shifting the atom A to the head produces

$A \leftarrow \Gamma'$    (5)

where $\Gamma' = \Gamma \setminus \{\mathit{not}\, A\}$.

Finally, the rule (5) is generalized by constructing a rule $R^*$ such that
$R^*\theta = A \leftarrow \Gamma'$ for some substitution θ. It is verified that the rule $R^*$ satisfies the
condition (2), i.e., $P \models_s \mathit{not}\, R^*$.

The  next  theorem  presents  a  sufficient  condition  for  the  correctness  of  R*  to 
induce  A. 


⁴ Since P is function-free, Γ consists of finitely many LP-literals.

Theorem 3.5. (correctness of the IE rule [39]) Let P be a function-free and
categorical NLP, A a ground atom, and $R^*$ a rule obtained as above. If $P \cup \{R^*\}$
is consistent and $\mathit{pred}(A)$ does not appear in P, then $P \cup \{R^*\} \models_s A$.

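Example 3.6. Let P be the program

$\mathit{bird}(x) \leftarrow \mathit{penguin}(x)$,
$\mathit{bird}(\mathit{tweety}) \leftarrow$,  $\mathit{penguin}(\mathit{polly}) \leftarrow$.

Given the example $L = \mathit{flies}(\mathit{tweety})$, it holds that $P \models_s \mathit{not}\, \mathit{flies}(\mathit{tweety})$. Our
goal is then to construct a rule R satisfying $P \cup \{R\} \models_s L$.

First, the set $M^+$ of LP-literals becomes

$M^+ = \{\, \mathit{bird}(\mathit{tweety}),\ \mathit{bird}(\mathit{polly}),\ \mathit{penguin}(\mathit{polly}),$
$\qquad \mathit{not}\, \mathit{penguin}(\mathit{tweety}),\ \mathit{not}\, \mathit{flies}(\mathit{tweety}),\ \mathit{not}\, \mathit{flies}(\mathit{polly}) \,\}$.

From $M^+$, picking up LP-literals which are relevant to L, the integrity constraint:

$\leftarrow \mathit{bird}(\mathit{tweety}),\ \mathit{not}\, \mathit{penguin}(\mathit{tweety}),\ \mathit{not}\, \mathit{flies}(\mathit{tweety})$

is constructed. Next, shifting $\mathit{flies}(\mathit{tweety})$ to the head produces

$\mathit{flies}(\mathit{tweety}) \leftarrow \mathit{bird}(\mathit{tweety}),\ \mathit{not}\, \mathit{penguin}(\mathit{tweety})$.

Finally, replacing tweety by a variable x, the rule

$R^*$ :  $\mathit{flies}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\, \mathit{penguin}(x)$

is obtained, where $P \cup \{R^*\} \models_s L$ holds.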
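
The construction used in Example 3.6 can be sketched in Python. This is an illustrative sketch under stated assumptions, not the paper's algorithm as implemented: a ground atom is a hypothetical pair (predicate, constants), an LP-literal is a pair (naf, atom), the unique stable model and the Herbrand base are supplied by the caller, and the final generalization step simply replaces every constant by its own variable, which is only one possible choice of $R^*$.

```python
def m_plus(model, herbrand_base):
    """M+ : the atoms of M plus 'not A' for every atom A of HB outside M."""
    return ({(False, a) for a in model}
            | {(True, a) for a in herbrand_base - model})

def relevant(example, literals):
    """Literals relevant to the example atom (clauses (i) and (ii) of the text)."""
    pred, consts = example
    # (i) same predicate and same set of constants as the example.
    chosen = {l for l in literals
              if l[1][0] == pred and set(l[1][1]) == set(consts)}
    # (ii) close under sharing a constant with an already relevant literal.
    changed = True
    while changed:
        changed = False
        seen = {c for _, (_, args) in chosen for c in args}
        for l in literals - chosen:
            if seen & set(l[1][1]):
                chosen.add(l)
                changed = True
    return chosen

def ie_rule(model, herbrand_base, example):
    """Build the integrity constraint, shift the example to the head, generalize."""
    gamma = relevant(example, m_plus(model, herbrand_base))
    assert (True, example) in gamma   # the example must be false in M
    body = gamma - {(True, example)}
    consts = sorted({c for _, (_, args) in gamma for c in args})
    var = {c: f"X{i}" for i, c in enumerate(consts)}
    lift = lambda atom: (atom[0], tuple(var[c] for c in atom[1]))
    return lift(example), {(naf, lift(a)) for naf, a in body}

M = {("bird", ("tweety",)), ("bird", ("polly",)), ("penguin", ("polly",))}
HB = M | {("penguin", ("tweety",)), ("flies", ("tweety",)), ("flies", ("polly",))}

head, body = ie_rule(M, HB, ("flies", ("tweety",)))
print(head, body)
```

On this input the sketch yields a rule with head flies(X0) and body bird(X0), not penguin(X0), which corresponds to the rule obtained in the example up to variable naming.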

The inverse entailment algorithm is also used for learning programs by
negative examples [38].


3.4  Other  Techniques 

This  section  reviews  other  techniques  for  learning  nonmonotonic  logic  programs. 

Bain and Muggleton [2] introduce the algorithm called Closed World
Specialization (CWS). In the algorithm, an initial program and an intended
interpretation that a learned program should satisfy are given. In this setting, any
atom which is not included in the interpretation is considered false. For instance,
suppose the program:

P :  $\mathit{flies}(x) \leftarrow \mathit{bird}(x)$,
     $\mathit{bird}(\mathit{eagle}) \leftarrow$,  $\mathit{bird}(\mathit{emu}) \leftarrow$

and the intended interpretation:

M :  $\{\, \mathit{flies}(\mathit{eagle}),\ \mathit{bird}(\mathit{eagle}),\ \mathit{bird}(\mathit{emu}) \,\}$,

where $\mathit{flies}(\mathit{emu})$ is not in M and is interpreted as false. As P implies $\mathit{flies}(\mathit{emu})$,
the CWS algorithm specializes P and produces

$\mathit{flies}(x) \leftarrow \mathit{bird}(x),\ \mathit{not}\, \mathit{ab}(x)$,
$\mathit{bird}(\mathit{eagle}) \leftarrow$,  $\mathit{bird}(\mathit{emu}) \leftarrow$,  $\mathit{ab}(\mathit{emu}) \leftarrow$.

Here, $\mathit{ab}(x)$ is a newly introduced atom.⁵ In this algorithm NAF is used for
specializing Horn clauses and the CWS produces normal logic programs.

Inoue and Kudoh [19] propose an algorithm called LELP which learns
extended logic programs (ELP) under the answer set semantics. The algorithm
is close to Bain and Muggleton's method but differs from it in
that [19] uses Open World Specialization (OWS) rather than the CWS under the
3-valued setting. The OWS does not use the closed world assumption to identify
negative instances of the target concept.

Given positive and negative examples, LELP first constructs (monotonic)
rules that cover positive examples by using an ordinary ILP algorithm,⁶ then
generates default rules to uncover negative examples by incorporating NAF literals
into the bodies of rules. In addition, exceptions to rules are identified from
negative examples and are then generalized to default cancellation rules. In LELP,
hierarchical defaults can be learned by recursively calling the exception
identification algorithm. Moreover, when some instances are possibly classified as both
positive and negative, nondeterministic rules can also be learned so that there
are multiple answer sets for the resulting program. Lamma et al. [22] formalize
the same problem under the well-founded semantics. In their algorithms,
different levels of generalization are strategically combined in order to learn solutions
for positive and negative concepts.

Dimopoulos and Kakas [12] construct default rules with exceptions. For
instance, suppose the background program:

P :  $\mathit{bird}(x) \leftarrow \mathit{penguin}(x)$,
     $\mathit{penguin}(x) \leftarrow \mathit{super\_penguin}(x)$,
     $\mathit{bird}(a) \leftarrow$,  $\mathit{bird}(b) \leftarrow$,
     $\mathit{penguin}(c) \leftarrow$,  $\mathit{super\_penguin}(d) \leftarrow$,

and the positive and negative examples:

$E^+$ :  $\mathit{flies}(a)$,  $\mathit{flies}(b)$,  $\mathit{flies}(d)$.

$E^-$ :  $\mathit{flies}(c)$.

Their algorithm first computes a rule which covers all the positive examples:

$r_1$ :  $\mathit{flies}(x) \leftarrow \mathit{bird}(x)$.

This rule also covers the negative example, so the algorithm next computes a
rule which explains the negative example:

$r_2$ :  $\neg \mathit{flies}(x) \leftarrow \mathit{penguin}(x)$.

In order to avoid drawing contradictory conclusions on c, the rule $r_2$ is given
priority over $r_1$. Likewise, the algorithm next computes the rule

$r_3$ :  $\mathit{flies}(x) \leftarrow \mathit{super\_penguin}(x)$

and $r_3$ is given priority over $r_2$. A unique feature of their algorithm is that
they learn rules using an ordinary ILP algorithm, and represent exceptions by a
prioritized hierarchy without using NAF.

⁵ Such an atom is called invented.

⁶ An "ordinary ILP" algorithm means any top-down/bottom-up ILP algorithm which is used
in clausal logic.

Sakama [39] presents a method of computing inductive hypotheses using
answer sets of extended logic programs. Given an ELP P and a ground literal L,
suppose a rule R satisfying $P \cup \{R\} \models_{AS} L$, where $\models_{AS}$ is entailment under
the answer set semantics. It is shown that this relation together with $P \not\models_{AS} L$
implies $P \not\models_{AS} R$. This provides a necessary condition for any possible
hypothesis R which explains L. A candidate hypothesis is then obtained by computing
answer sets of P, and constructing a rule which is unsatisfied in an answer set.
The method provides the same result as [38] in a much simpler manner. In
function-free stratified programs the algorithm constructs inductive hypotheses
in polynomial time.

Bergadano et al. [4] propose the system called TRACY$^{not}$ which learns NLPs
using the derivation information of examples. In this system candidate
hypotheses are given as input to the system, and from those candidates the system
selects hypotheses which cover/uncover positive/negative examples. Martin and
Vrain [25] introduce an algorithm to learn NLPs under the 3-valued semantics.
Given a 3-valued model of a background program, it constructs (possibly
recursive) rules to explain examples. Seitzer [40] proposes a system called INDED. It
consists of a deductive engine which computes stable models or the well-founded
model of a background NLP, and an inductive engine which induces hypotheses
using the computed models and positive/negative examples. It can learn
unstratified programs. Fogel and Zaverucha [16] propose an algorithm for learning
strict and call-consistent NLPs, which effectively searches the hypothesis space
using subsumption and iteratively constructed training examples.

Finally,  the  algorithms  presented  in  this  paper  are  summarized  in  Table  1. 

For related research, learning abductive logic programs [13,20,21,23] and
learning action theories [24] are important applications of NMILP.

Table 1. Comparison of Algorithms

Learned Programs   Algorithms                      References
NLP                Ordinary ILP + specialization   [2]
                   Selection from candidates       [4]
                   Top-down                        [16,25,40]
                   Inverse resolution              [37,43]
                   Inverse entailment              [38]
                   Least generalization            Section 3.1
ELP                Ordinary ILP                    [12]
                   Ordinary ILP + specialization   [19,22]
                   Computing Answer Sets           [39]

4  Summary and Open Issues

We presented an overview of techniques for realizing induction in nonmonotonic
logic programs. Techniques in ILP have been centered on clausal logic so far,
especially on Horn logic. However, as nonmonotonic logic programs are different
from classical logic, existing techniques are not directly applicable to
nonmonotonic situations. In contrast to clausal ILP, the field of nonmonotonic ILP is less
explored and several issues remain open. Such issues include:

-  Generalization under implication: In Section 3.1, we introduced the
subsumption order between rules and provided an algorithm for computing a least
generalization, which is an easy extension of the one in clausal logic. On the other
hand, in clausal theories there is another generalization based on the
implication order, which uses the entailment relation $C_1 \models C_2$ between two clauses $C_1$
and $C_2$. The result of clausal logic on generalization under implication, however,
is not directly applicable to NMLP. This is because the
entailment relation in NMLP is considered under the commonsense semantics,
which is different from the classical entailment relation. For instance, under the
stable model semantics, the relation $\models_s$ is used instead of $\models$. Generality
relations under implication would have properties different from the subsumption
order, and the existence of least generalizations and their computability are to
be examined.

-  Generalization operations in nonmonotonic logic programs: In clausal
theories, operations by inverting resolution generalize programs, but as presented
in Section 3.2, they do not generalize programs in nonmonotonic situations in
general. It is therefore important to develop program transformations which
generalize nonmonotonic logic programs (under particular semantics) in general.
Such transformations would serve as fundamental operations in nonmonotonic
ILP. An example of this kind of transformation is seen in [43].

-  Relations between induction and other commonsense reasoning:
Induction is a kind of nonmonotonic inference, hence theoretical relations between
induction and other nonmonotonic formalisms, including nonmonotonic logic
programming, are of interest. Such relations will enable us to implement ILP in
terms of NMLP, and also open possibilities to integrate induction and
commonsense reasoning. Research in this direction is found in [1,14].
Ten years have passed since the first LPNMR conference was held in 1991.
In [32] the preface says:

    ... there has been growing interest in the relationship between logic
    programming semantics and non-monotonic reasoning. It is now reasonably
    clear that there is ample scope for each of these areas to contribute to
    the other.

As a concluding remark, we restate the same sentence for NMLP and
ILP. Combining NMLP and ILP in the framework of nonmonotonic inductive
logic programming is an important step towards a better knowledge
representation tool, and will bring fruitful advances in each field.
Abstract. Logic programs P and Q are strongly equivalent if, given
any logic program R, programs $P \cup R$ and $Q \cup R$ are equivalent (that
is, have the same answer sets). Strong equivalence is convenient for the
study of equivalent transformations of logic programs: one can prove
that a local change is correct without considering the whole program.
Recently, Lifschitz, Pearce and Valverde showed that Heyting's logic of
here-and-there can be used to characterize strong equivalence of logic
programs. This paper offers a more direct characterization, and extends
it to default logic. In their paper, Lifschitz, Pearce and Valverde study a
very general form of logic programs, called "nested" programs. For the
study of strong equivalence of default theories, it is convenient to
introduce a corresponding "nested" version of default logic, which generalizes
Reiter's default logic.


1  Introduction 


Logic programs P and Q are "strongly equivalent" if, given any logic
program R, $P \cup R$ and $Q \cup R$ are equivalent. Recent work by Lifschitz, Pearce
and Valverde [4] uses Heyting's logic of here-and-there to characterize strong
equivalence of logic programs under the answer set semantics [1,3]. Their proof
utilizes Pearce's equilibrium logic [5,6]. In the current paper, strong equivalence
of logic programs is characterized more directly, in terms of concepts used in
the definition of answer sets; no knowledge of the logic of here-and-there is
required. This simplifies the proof of the main strong equivalence theorem, and
may also make the result easier to apply to specific cases. Moreover, this
alternative characterization of strong equivalence is easily extended to Reiter's default
logic [7].

Strong  equivalence  can  help  us  reason  about  correctness  of  logic  programs 
and  default  theories.  For  example,  as  discussed  in  [4],  it  can  be  used  to  establish 
the fact that in any logic program with a constraint of the form

$$\bot \leftarrow F, G$$

the disjunctive rule

$$F ; G \leftarrow \top$$

can be replaced by the pair of rules

$$F \leftarrow \mathit{not}\ G$$
$$G \leftarrow \mathit{not}\ F$$

without affecting the program’s answer sets.

Lifschitz,  Pearce  and  Valverde  consider  strong  equivalence  for  a  very  general 
form  of  logic  programs,  called  “nested”  programs,  introduced  by  Lifschitz,  Tang 
and  Turner  [3].  For  the  study  of  strong  equivalence  of  default  theories,  it  is 
convenient  to  introduce  similarly  general  “nested”  default  theories. 

Section  2  reviews  definitions  for  nested  logic  programming.  Section  3  states 
and  proves  a  simple  characterization  of  strong  equivalence  for  logic  programs. 
Section  4  makes  precise  the  relationship  between  our  strong  equivalence  theorem 
and  that  obtained  using  the  logic  of  here-and-there.  Section  5  briefly  investigates 
strongly  equivalent  transformations  of  logic  programs.  Taking  advantage  of  the 
strong similarities between definitions for logic programming and default logic,
Section  6  introduces  “nested”  default  logic,  and  shows  that  it  extends  both 
nested  logic  programming  and  disjunctive  default  logic  [2],  which  in  turn  extends 
Reiter’s  default  logic.  Section  7  states  a  characterization  of  strong  equivalence 
for  nested  default  theories  similar  to  that  for  nested  logic  programs.  Section  8 
briefly  investigates  strongly  equivalent  transformations  of  default  theories.  Proofs 
related  to  nested  default  logic  appear  in  Section  9. 

2  Nested  Logic  Programming 

This paper employs the definition of logic programs from [3], although the
presentation differs in some details.


2.1  Syntax 

The words atom and literal are understood here as in propositional logic.
Elementary formulas are literals and the 0-place connectives $\bot$ (“false”)
and $\top$ (“true”). NLP formulas are built from elementary formulas using the
unary connective not and the binary connectives , (conjunction) and ;
(disjunction). An NLP rule is an expression of the form

$$F \leftarrow G$$

where F and G are NLP formulas, called the head and the body of the rule.

A nested logic program is a set of NLP rules.

When convenient, a rule $F \leftarrow \top$ is identified with the formula F.


2.2  Semantics 

Let us first define recursively when a consistent set X of literals satisfies
an NLP formula F (symbolically, $X \models F$), as follows.

- For elementary F, $X \models F$ iff $F \in X$ or $F = \top$.

- $X \models (F, G)$ iff $X \models F$ and $X \models G$.

- $X \models (F; G)$ iff $X \models F$ or $X \models G$.

- $X \models \mathit{not}\ F$ iff $X \not\models F$.

A consistent set X of literals is closed under a program P if, for every rule
$F \leftarrow G$ in P, $X \models F$ whenever $X \models G$.

The reduct of a formula F relative to a consistent set X of literals (written
$F^X$) is obtained by replacing every maximal occurrence in F of a formula
of the form not G with $\bot$ if $X \models G$ and with $\top$ otherwise. The
reduct of a program P relative to X (written $P^X$) is obtained by replacing
the head and body of each rule in P by their reducts relative to X.

A consistent set X of literals is an answer set for P if it is minimal among
the consistent sets of literals closed under $P^X$.

As  discussed  in  [3],  this  definition  agrees  with  previous  versions  of  the  answer 
set  semantics  on  consistent  answer  sets  (but  does  not  allow  for  an  inconsistent 
one). 
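
The definitions of this section can be made concrete with a small brute-force sketch. The encoding is an assumption of this example and not notation from the paper: literals are strings such as "p" and "-p" (classical negation), TOP and BOT stand for the 0-place connectives, compound formulas are tuples ("not", F), ("and", F, G) and ("or", F, G), and a rule with head F and body G is the pair (F, G). All function names are hypothetical.

```python
from itertools import combinations

TOP, BOT = "TOP", "BOT"  # assumed encodings of the 0-place connectives

def sat(X, f):
    """X |= f, for a consistent set X of literals and an NLP formula f."""
    if f == TOP:
        return True
    if f == BOT:
        return False
    if isinstance(f, str):            # a literal such as "p" or "-p"
        return f in X
    op = f[0]
    if op == "not":
        return not sat(X, f[1])
    if op == "and":
        return sat(X, f[1]) and sat(X, f[2])
    if op == "or":
        return sat(X, f[1]) or sat(X, f[2])
    raise ValueError(f"unknown connective: {op}")

def reduct(f, X):
    """F^X: replace each maximal occurrence of 'not G' by BOT or TOP."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return BOT if sat(X, f[1]) else TOP
    return (f[0], reduct(f[1], X), reduct(f[2], X))

def closed(X, program, Y):
    """True iff X is closed under the reduct of program relative to Y."""
    return all(sat(X, reduct(head, Y))
               for head, body in program if sat(X, reduct(body, Y)))

def consistent_sets(atoms):
    """All consistent sets of literals over the given atoms."""
    literals = list(atoms) + ["-" + a for a in atoms]
    for size in range(len(literals) + 1):
        for combo in combinations(literals, size):
            X = frozenset(combo)
            if not any(a in X and "-" + a in X for a in atoms):
                yield X

def answer_sets(program, atoms):
    """Consistent sets X minimal among those closed under program^X."""
    candidates = list(consistent_sets(atoms))
    return [X for X in candidates
            if closed(X, program, X)
            and not any(Z < X and closed(Z, program, X) for Z in candidates)]

# The rule  p ; q <- TOP  has the two answer sets {p} and {q}.
print(answer_sets([(("or", "p", "q"), TOP)], ["p", "q"]))
```

The enumeration is exponential in the number of atoms, so this is only meant for checking small examples by hand.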

3  Strong  Equivalence  of  Logic  Programs 

Logic programs P and Q are equivalent if they have the same answer sets. They
are strongly equivalent if, for any logic program R, $P \cup R$ and $Q \cup R$
are equivalent.

The following terminology is convenient. For program P, and consistent
sets X, Y of literals with $X \subseteq Y$, call the pair $(X, Y)$ an SE-model
of P if both X and Y are closed under $P^Y$.

In  Section  4,  we  will  see  that  SE-models  correspond  to  models  in  the  logic  of 
here-and-there. 

Theorem  1.  Logic  programs  are  strongly  equivalent  iff  they  have  the  same 
SE-models. 

Proof. Right to left: Assume that programs P and Q have the same SE-models.
Take any program R. We need to show that $P \cup R$ and $Q \cup R$ are equivalent.
Assume that X is an answer set for $P \cup R$. That is, X is a consistent set of
literals closed under $(P \cup R)^X$, and no proper subset of X is closed under
$(P \cup R)^X$. Since $(P \cup R)^X = P^X \cup R^X$, X is closed under both $P^X$
and $R^X$. Since X is closed under $P^X$, the pair $(X, X)$ is an SE-model of P,
so it follows by assumption that X is closed under $Q^X$. So X is closed under
$Q^X \cup R^X = (Q \cup R)^X$. Suppose a proper subset Z of X is closed under
$(Q \cup R)^X$. Then Z is closed under both $Q^X$ and $R^X$, so $(Z, X)$ is an
SE-model of Q. By assumption Z is also closed under $P^X$, and thus under
$(P \cup R)^X$, contradicting the choice of X. We conclude that every answer set
for $P \cup R$ is an answer set for $Q \cup R$. By symmetry, every answer set for
$Q \cup R$ is an answer set for $P \cup R$.

Left to right: Assume (wlog) that $(X, Y)$ is an SE-model of program P but
not of program Q. We need to show that P and Q are not strongly equivalent.
Consider two cases.

Case 1: Y is not closed under $Q^Y$. Then Y is not closed under
$(Q \cup Y)^Y = Q^Y \cup Y$, and so is not an answer set for $Q \cup Y$. On the
other hand, one easily verifies that Y is an answer set for $P \cup Y$. Hence P
and Q are not strongly equivalent.

Case 2: Y is closed under $Q^Y$. Take
$R = X \cup \{F \leftarrow G : F, G \in Y \setminus X\}$. Clearly Y is closed
under $(Q \cup R)^Y$. Let Z be a subset of Y closed under
$(Q \cup R)^Y = Q^Y \cup R$. By choice of R, $X \subseteq Z$, and by assumption X
is not closed under $Q^Y$, so $X \neq Z$. Hence some $L \in Y \setminus X$ belongs
to Z. It follows by choice of R that $Y \setminus X \subseteq Z$. Consequently
$Z = Y$, and so Y is an answer set for $Q \cup R$. On the other hand, X is a proper
subset of Y that is closed under $(P \cup R)^Y = P^Y \cup R$. So Y is not an answer
set for $P \cup R$, and we conclude again that P and Q are not strongly
equivalent. □

Although simpler (due to simpler definitions), this proof resembles in many
details the proof of the main theorem in [4], including the fact that it
demonstrates that if logic programs P and Q are not strongly equivalent then
they can be distinguished by adding a logic program in which the head of each
rule is a literal and the body of each rule is either a literal or $\top$.
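
As an illustration of Theorem 1 and of the construction in Case 2 of the proof, the following sketch enumerates SE-models by brute force. The encoding (literals as strings, "-p" for classical negation, formulas as tuples, rules as (head, body) pairs) and all function names are assumptions of this example. It compares the programs {p ; q} and {p ← not q, q ← not p}, which have the same answer sets but different SE-models, and builds the distinguishing program R from an SE-model on which they differ.

```python
from itertools import combinations

TOP, BOT = "TOP", "BOT"

def sat(X, f):
    """X |= f for a consistent set X of literals."""
    if f in (TOP, BOT):
        return f == TOP
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    left, right = sat(X, f[1]), sat(X, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, X):
    """Reduct of formula f relative to X."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return BOT if sat(X, f[1]) else TOP
    return (f[0], reduct(f[1], X), reduct(f[2], X))

def closed(X, program, Y):
    """True iff X is closed under program^Y."""
    return all(sat(X, reduct(h, Y)) for h, b in program if sat(X, reduct(b, Y)))

def consistent_sets(atoms):
    lits = list(atoms) + ["-" + a for a in atoms]
    sets = (frozenset(c) for n in range(len(lits) + 1) for c in combinations(lits, n))
    return [X for X in sets if not any(a in X and "-" + a in X for a in atoms)]

def answer_sets(program, atoms):
    cands = consistent_sets(atoms)
    return {X for X in cands if closed(X, program, X)
            and not any(Z < X and closed(Z, program, X) for Z in cands)}

def se_models(program, atoms):
    """Pairs (X, Y) with X a subset of Y and both closed under program^Y."""
    cands = consistent_sets(atoms)
    return {(X, Y) for Y in cands if closed(Y, program, Y)
            for X in cands if X <= Y and closed(X, program, Y)}

atoms = ["p", "q"]
P = [(("or", "p", "q"), TOP)]                    # p ; q
Q = [("p", ("not", "q")), ("q", ("not", "p"))]   # p <- not q,  q <- not p

same_answer_sets = answer_sets(P, atoms) == answer_sets(Q, atoms)
same_se_models = se_models(P, atoms) == se_models(Q, atoms)

# ({}, {p, q}) is an SE-model of Q but not of P, and {p, q} is closed under
# P^{p, q}, so Case 2 of the proof applies with the roles of P and Q exchanged.
X, Y = frozenset(), frozenset({"p", "q"})
R = [(lit, TOP) for lit in X] + [(f, g) for f in Y - X for g in Y - X]
distinguished = answer_sets(P + R, atoms) != answer_sets(Q + R, atoms)
print(same_answer_sets, same_se_models, distinguished)
```

Here R consists only of rules whose head is a literal and whose body is a literal, in line with the remark above.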

4  HT-Models  and  the  Logic  of  Here-and-There 

Lifschitz, Pearce and Valverde identify logic program rules with formulas in the
logic of here-and-there, and show that programs are strongly equivalent iff they
are equivalent in the logic of here-and-there.

They consider nested programs, as described in Section 2, except that they
do not allow classical negation. (That is, their programs do not contain the
symbol $\neg$.) Accordingly, they define answer sets using sets of atoms in place
of consistent sets of literals. For convenience, the term “stable model” will be
used to refer to an answer set in their sense.

After establishing their strong equivalence theorem (with respect to stable
models) for nested programs without classical negation, they explain that the
result can be extended to all nested programs as follows. Take any nested
program P. For each atom A in the language of P, add a new atom A', and let P' be
the program in this extended language obtained by (i) replacing each occurrence
of each negative literal $\neg A$ with atom A', and (ii) adding the rule
$\bot \leftarrow A, A'$ for every new atom A'. The answer sets for P are in
one-to-one correspondence with the stable models of P'. More precisely, given any
set X of literals, let X' be obtained by replacing each negative literal
$\neg A \in X$ by A'. Then X is an answer set for P iff X' is a stable model of P'.
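
The translation from P to P' can be sketched as follows. The encoding is an assumption of this example: literals are strings, "-p" stands for the negative literal and "p'" for the new atom, formulas are tuples, and rules are (head, body) pairs. The helper names are hypothetical. The sketch checks the stated correspondence on one small program.

```python
from itertools import combinations

TOP, BOT = "TOP", "BOT"

def sat(X, f):
    if f in (TOP, BOT):
        return f == TOP
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    left, right = sat(X, f[1]), sat(X, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, X):
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return BOT if sat(X, f[1]) else TOP
    return (f[0], reduct(f[1], X), reduct(f[2], X))

def closed(X, program, Y):
    return all(sat(X, reduct(h, Y)) for h, b in program if sat(X, reduct(b, Y)))

def minimal_closed_sets(program, literals):
    """Sets X over `literals`, never containing both l and -l, that are
    minimal among such sets closed under program^X."""
    cands = [frozenset(c) for n in range(len(literals) + 1)
             for c in combinations(literals, n)]
    cands = [X for X in cands if not any("-" + l in X for l in X)]
    return {X for X in cands if closed(X, program, X)
            and not any(Z < X and closed(Z, program, X) for Z in cands)}

def prime(f):
    """Replace every negative literal -A in f by the new atom A'."""
    if f in (TOP, BOT):
        return f
    if isinstance(f, str):
        return f[1:] + "'" if f.startswith("-") else f
    return (f[0],) + tuple(prime(g) for g in f[1:])

def translate(program, atoms):
    """P': prime every rule and add the constraint BOT <- A, A' per atom."""
    rules = [(prime(h), prime(b)) for h, b in program]
    return rules + [(BOT, ("and", a, a + "'")) for a in atoms]

atoms = ["p", "q"]
P = [("-p", ("not", "p")), ("q", "-p")]      # -p <- not p,  q <- -p
answer = minimal_closed_sets(P, atoms + ["-" + a for a in atoms])
stable = minimal_closed_sets(translate(P, atoms), atoms + [a + "'" for a in atoms])
print(answer, stable)
```

Stable models are computed here with the same routine as answer sets, applied to a language that contains atoms only.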

It  follows  that  nested  programs  P  and  Q  are  strongly  equivalent  (in  the 
sense  of  this  paper)  iff  P'  and  Q'  are  strongly  equivalent  wrt  stable  models. 
Moreover,  for  any  nested  programs  P  and  Q  without  classical  negation,  P  and  Q 
are  strongly  equivalent  wrt  stable  models  iff  P'  and  Q'  are. 

In [4], an HT-interpretation is a pair of sets of atoms in which the first set
is a subset of the second. Without going into details, we can observe that they
define when an HT-interpretation is a model of a logic program in the sense of
the logic of here-and-there. Although it is not done here, one can easily verify
that their Lemmas 1 and 2 together imply the following.

Proposition 1. For any nested logic program P, $(X, Y)$ is an SE-model of P
iff $(X', Y')$ is a model of P' in the logic of here-and-there.

So  these  approaches  are  essentially  equivalent  with  regard  to  logic  programs. 
Each  has  advantages. 

The primary advantage of the approach introduced here is its relative
simplicity. The definition of SE-model is quite straightforward, based on
concepts already introduced in the definition of answer sets. This in turn
simplifies the proof of the strong equivalence theorem. Moreover, the
(relatively) simple definition can make the theorem easier to apply to specific
cases.

The definition introduced in this paper takes advantage of the special status
of the symbol $\leftarrow$ in definitions of logic programming. By comparison,
the logic of here-and-there treats $\leftarrow$ as just another connective, and
even defines not in terms of it: not F is understood as an abbreviation for
$\bot \leftarrow F$. The possibility of nested occurrences of $\leftarrow$
complicates the truth definition considerably.

It  is  important  to  note,  though,  that  this  complication  takes  a  familiar  form — 
the  truth  definition  in  the  logic  of  here-and-there  uses  standard  Kripke  models. 
In  fact,  they  are  a  special  case  of  Kripke  models  for  intuitionistic  logic  (which  is, 
accordingly,  slightly  weaker).  Thus,  such  an  approach  brings  with  it  a  range  of 
associations that may help clarify intuitions about the meaning of connectives
$\leftarrow$ and not in logic programming.

Even  if  we  consider  only  convenience  in  the  study  of  strong  equivalence  (or 
similar  properties),  the  logic  of  here-and-there  offers  a  potential  advantage:  it  is 
a  logic  with  known  identities,  deduction  rules,  and  such,  which  can  be  used  to 
establish  strong  equivalence  in  particular  cases. 

Nonetheless,  when  we  wish  to  apply  strong  equivalence  results,  it  seems  likely 
that  a  model-theoretic  argument  using  the  definition  from  this  paper  will  often 
be  easier  than  a  proof-theoretic  argument  using  known  properties  of  the  logic  of 
here-and-there. 

5  Equivalent  Transformations  of  Logic  Programs 

To demonstrate the use of Theorem 1, let us consider again the example from
the introduction: for any NLP formulas F and G, programs $P_1$ and $P_2$ below
have the same SE-models.

$$P_1 = \{\, F ; G,\ \ \bot \leftarrow F, G \,\}$$
$$P_2 = \{\, F \leftarrow \mathit{not}\ G,\ \ G \leftarrow \mathit{not}\ F,\ \ \bot \leftarrow F, G \,\}$$

To see this, take any pair $(X, Y)$ of consistent sets of literals such that
$X \subseteq Y$, and consider four cases.

Case 1: $Y \models (F, G)^Y$. Then Y is not closed under either of $P_1^Y$ or
$P_2^Y$, so $(X, Y)$ is not an SE-model of $P_1$ or $P_2$.

Case 2: $Y \models (F, \mathit{not}\ G)^Y$. So Y is closed under both $P_1^Y$ and
$P_2^Y$. Notice that not does not occur in $G^Y$. It follows that since
$Y \not\models G^Y$ and $X \subseteq Y$, $X \not\models G^Y$. We can conclude that
X is closed under $P_1^Y$ iff $X \models F^Y$ iff X is closed under $P_2^Y$. So
$(X, Y)$ is an SE-model of $P_1$ iff it is an SE-model of $P_2$.

Case 3: $Y \models (\mathit{not}\ F, G)^Y$. Symmetric to previous case.

Case 4: $Y \models (\mathit{not}\ F, \mathit{not}\ G)^Y$. Similar to first case.
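
The case analysis can be cross-checked mechanically on small instances. The sketch below is only an illustration under an assumed encoding (literals as strings, "-p" for classical negation, tuples for compound formulas, (head, body) pairs for rules); the function names are hypothetical. It builds the two programs for given F and G and compares their SE-models, and also shows that the comparison fails once the constraint is dropped.

```python
from itertools import combinations

TOP, BOT = "TOP", "BOT"

def sat(X, f):
    if f in (TOP, BOT):
        return f == TOP
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    left, right = sat(X, f[1]), sat(X, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, X):
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return BOT if sat(X, f[1]) else TOP
    return (f[0], reduct(f[1], X), reduct(f[2], X))

def closed(X, program, Y):
    return all(sat(X, reduct(h, Y)) for h, b in program if sat(X, reduct(b, Y)))

def consistent_sets(atoms):
    lits = list(atoms) + ["-" + a for a in atoms]
    sets = (frozenset(c) for n in range(len(lits) + 1) for c in combinations(lits, n))
    return [X for X in sets if not any(a in X and "-" + a in X for a in atoms)]

def se_models(program, atoms):
    cands = consistent_sets(atoms)
    return {(X, Y) for Y in cands if closed(Y, program, Y)
            for X in cands if X <= Y and closed(X, program, Y)}

def example_programs(F, G):
    """The two programs of this section, for NLP formulas F and G."""
    constraint = (BOT, ("and", F, G))
    P1 = [(("or", F, G), TOP), constraint]
    P2 = [(F, ("not", G)), (G, ("not", F)), constraint]
    return P1, P2

atoms = ["p", "q", "r"]
results = []
for F, G in [("p", "q"), ("p", ("and", "q", "r"))]:
    P1, P2 = example_programs(F, G)
    results.append(se_models(P1, atoms) == se_models(P2, atoms))

# Without the constraint, the two programs are no longer strongly equivalent.
P1, P2 = example_programs("p", "q")
without_constraint = se_models(P1[:1], atoms) == se_models(P2[:2], atoms)
print(results, without_constraint)
```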

When strong equivalence is characterized using the logic of here-and-there,
we immediately obtain a replacement theorem: strong equivalence is preserved
under substitution of formulas that are equivalent in the logic of
here-and-there. And of course it follows that if formulas F and G are satisfied
by the same (here-and-there) models of a program P, then, for any program Q,
occurrences of F in Q can be replaced by G without affecting the answer sets of
$P \cup Q$. One can provide a similar facility using SE-models. Let us begin with
two definitions.

We say that NLP formulas F and G are equivalent relative to logic program P
if, for every SE-model $(X, Y)$ of P, $X \models F^Y$ iff $X \models G^Y$.

An occurrence of a formula is regular if it is not an atom preceded by $\neg$.

Theorem 2. Let P be a program, and let F and G be formulas equivalent relative
to P. For any program Q, and any program Q' obtained from Q by replacing
regular occurrences of F by G, programs $P \cup Q$ and $P \cup Q'$ are strongly
equivalent.

The restriction to regular occurrences is essential. For example, formulas p
and q are equivalent relative to program
$P_3 = \{p \leftarrow q,\ q \leftarrow p\}$, yet programs $P_3 \cup \{\neg p\}$
and $P_3 \cup \{\neg q\}$ are not strongly equivalent.
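
The counterexample, together with one permitted replacement, can be checked with a brute-force sketch. As before, the encoding (strings for literals with "-p" for a negative literal, tuples for compound formulas, (head, body) pairs for rules) and the function names are assumptions of this example. Because "-p" is a single literal in this encoding, replacing p inside it corresponds to replacing a non-regular occurrence.

```python
from itertools import combinations

TOP, BOT = "TOP", "BOT"

def sat(X, f):
    if f in (TOP, BOT):
        return f == TOP
    if isinstance(f, str):
        return f in X
    if f[0] == "not":
        return not sat(X, f[1])
    left, right = sat(X, f[1]), sat(X, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, X):
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return BOT if sat(X, f[1]) else TOP
    return (f[0], reduct(f[1], X), reduct(f[2], X))

def closed(X, program, Y):
    return all(sat(X, reduct(h, Y)) for h, b in program if sat(X, reduct(b, Y)))

def consistent_sets(atoms):
    lits = list(atoms) + ["-" + a for a in atoms]
    sets = (frozenset(c) for n in range(len(lits) + 1) for c in combinations(lits, n))
    return [X for X in sets if not any(a in X and "-" + a in X for a in atoms)]

def se_models(program, atoms):
    cands = consistent_sets(atoms)
    return {(X, Y) for Y in cands if closed(Y, program, Y)
            for X in cands if X <= Y and closed(X, program, Y)}

def equivalent_relative(F, G, program, atoms):
    """F and G agree on X |= F^Y vs X |= G^Y for every SE-model (X, Y)."""
    return all(sat(X, reduct(F, Y)) == sat(X, reduct(G, Y))
               for X, Y in se_models(program, atoms))

atoms = ["p", "q", "r"]
P3 = [("p", "q"), ("q", "p")]                       # p <- q,  q <- p

p_equiv_q = equivalent_relative("p", "q", P3, atoms)

# Regular occurrence: the body p of  r <- p  may be replaced by q.
regular_ok = se_models(P3 + [("r", "p")], atoms) == se_models(P3 + [("r", "q")], atoms)

# Non-regular occurrence: replacing p by q inside the literal -p is not safe.
non_regular_ok = se_models(P3 + [("-p", TOP)], atoms) == se_models(P3 + [("-q", TOP)], atoms)
print(p_equiv_q, regular_ok, non_regular_ok)
```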

Theorem 2 is a more widely-applicable version of Proposition 3 from [3].
There we defined equivalence of formulas more strictly, and did not make it
relative to a program. We also used a notion of “equivalence” of programs
stronger than strong equivalence. Although it is not done here, a proof of
Theorem 2 can be easily constructed based on the corresponding proof from the
earlier paper. (Section 9 does include a similar proof, namely that of the
corresponding theorem for “nested” default logic.) Alternatively, just as
Proposition 1 related the SE-models $(X, Y)$ of a program P with the models
$(X', Y')$ of program P' under the logic of here-and-there, one can show that NLP
formulas F and G are equivalent relative to program P iff the corresponding
formulas F' and G' are satisfied by the same models of P' in the logic of
here-and-there.

Many  formula  equivalences  are  proved  in  [3]  (Proposition  4),  and  of  course 
they  also  hold  under  our  (weaker)  definition  (relative  to  the  empty  program). 
Thus, Theorem 2 implies, for instance, that replacing subformulas of the form
not (F, G) with not F; not G yields a strongly equivalent program.

For another example using Theorem 2, observe that for any program Q,
and any program Q' obtained from Q by replacing occurrences of not F by G
and/or not G by F, programs $P_2 \cup Q$ and $P_2 \cup Q'$ are strongly
equivalent.
6 Nested Default Logic

For the study of strong equivalence of default theories, it is convenient to
introduce a “nested” version of default logic that generalizes disjunctive
default logic [2], which in turn generalizes Reiter’s default logic.

The relatively uniform syntax of nested default logic will make it more
convenient for stating and using strong equivalence results. (We don’t have to
deal separately with a prerequisite and a set of justifications: they are
expressed in a single formula.)

As  one  might  expect,  the  definitions  for  nested  default  logic  are  almost  exactly 
as  for  nested  logic  programs — essentially,  allow  arbitrary  formulas  of  classical 
logic  in  place  of  literals,  and  use  consistent,  logically  closed  sets  of  formulas  in 
place  of  consistent  sets  of  literals.  Accordingly,  the  strong  equivalence  theorem 
(and  its  proof!)  is  nearly  identical  too. 

6.1  Syntax 

Let  us  say  classical  formula  to  mean  a  formula  of  classical  propositional  logic. 

NDL formulas are built from classical formulas using the unary connective
not (negation as failure) and the binary connectives | (strong disjunction)
and $\wedge$ (conjunction). (There is no need for a distinct “strong
conjunction” connective.) An NDL rule is an expression of the form

$$\frac{F}{G}$$

where F and G are NDL formulas, called the condition and the conclusion of
the rule.

A nested default theory is a set of NDL rules.

When convenient, a rule of the form $\frac{\top}{F}$ will be identified with
formula F.


6.2  Semantics 

Let  us  use  the  term  candidate  set  for  a  consistent  set  of  classical  formulas  that 
is  closed  under  classical  propositional  logic. 

We can recursively define when a candidate set X satisfies an NDL formula F
(symbolically, $X \models F$), as follows.

- For classical F, $X \models F$ iff $F \in X$.

- $X \models (F \wedge G)$ iff $X \models F$ and $X \models G$.

- $X \models (F \mid G)$ iff $X \models F$ or $X \models G$.

- $X \models \mathit{not}\ F$ iff $X \not\models F$.

A candidate set X is closed under a default theory P if, for every rule
$\frac{F}{G}$ in P, $X \models F$ implies $X \models G$.

The reduct of an NDL formula F relative to a candidate set X (written $F^X$)
is obtained by replacing every maximal occurrence in F of a formula of the
form not G with $\bot$ if $X \models G$ and with $\top$ otherwise. The reduct of
a default theory P relative to X (written $P^X$) is obtained by replacing the
condition and conclusion of each rule in P by their reducts relative to X.

A candidate set X is an extension of P if it is minimal among the candidate
sets closed under $P^X$.
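
These definitions can also be explored by brute force, under some assumptions that are specific to this sketch. The language is restricted to a finite list of atoms, and a candidate set is represented by the nonempty set M of valuations that satisfy all its formulas; a classical formula then belongs to the candidate set iff it holds in every valuation of M, and one candidate set is a proper subset of another iff its set of valuations is a proper superset. Classical formulas are encoded as Python predicates on valuations, NDL connectives as tuples ("not", F), ("and", F, G), ("bar", F, G), and a rule as the pair (condition, conclusion). All names are hypothetical.

```python
from itertools import combinations, product

TOP = lambda v: True     # classical formula "true"
BOT = lambda v: False    # classical formula "false"

def atom(a):
    return lambda v: a in v

def neg(f):
    return lambda v: not f(v)

def lor(f, g):
    return lambda v: f(v) or g(v)

def sat(M, f):
    """Th(M) |= f, where M is the set of models of a candidate set."""
    if callable(f):                       # classical formula
        return all(f(v) for v in M)
    if f[0] == "not":
        return not sat(M, f[1])
    left, right = sat(M, f[1]), sat(M, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, M):
    if callable(f):
        return f
    if f[0] == "not":
        return BOT if sat(M, f[1]) else TOP
    return (f[0], reduct(f[1], M), reduct(f[2], M))

def closed(M, theory, N):
    """Th(M) is closed under the reduct of theory relative to Th(N)."""
    return all(sat(M, reduct(concl, N))
               for cond, concl in theory if sat(M, reduct(cond, N)))

def candidate_sets(atoms):
    vals = [frozenset(a for a, bit in zip(atoms, bits) if bit)
            for bits in product([False, True], repeat=len(atoms))]
    return [frozenset(c) for n in range(1, len(vals) + 1)
            for c in combinations(vals, n)]

def extensions(theory, atoms):
    cands = candidate_sets(atoms)
    # Z > M (more models) means Th(Z) is a proper subset of Th(M).
    return [M for M in cands if closed(M, theory, M)
            and not any(Z > M and closed(Z, theory, M) for Z in cands)]

p, q = atom("p"), atom("q")
# NDL rule with condition (TOP and not -p) and conclusion p.
reiter = extensions([(("and", TOP, ("not", neg(p))), p)], ["p"])
# Strong disjunction p | q versus classical disjunction p v q as conclusions.
strong = extensions([(TOP, ("bar", p, q))], ["p", "q"])
weak = extensions([(TOP, lor(p, q))], ["p", "q"])
print(len(reiter), len(strong), len(weak))
```

In this sketch the rule with conclusion p | q yields two extensions (one containing p, one containing q), whereas the rule with the classical conclusion p ∨ q yields a single extension.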


6.3  Relation  to  (Nested)  Logic  Programming 

Essentially, nested logic programming is a special case of nested default logic.
Every NLP formula F corresponds to the NDL formula $d(F)$ obtained by
replacing occurrences of the connectives ; and , with | and $\wedge$
respectively. A nested logic program corresponds to the default theory obtained
by replacing each NLP rule $F \leftarrow G$ with $\frac{d(G)}{d(F)}$. A
consistent set of literals corresponds to the candidate set whose formulas are
its consequences (in classical logic).

Proposition  2.  The  answer  sets  for  any  nested  logic  program  correspond  to  the 
extensions  of  the  corresponding  nested  default  theory. 


6.4  Relation  to  (Disjunctive)  Default  Logic 

Nested default logic generalizes disjunctive default logic [2], which in turn
generalizes Reiter’s default logic. Here we review the definition of disjunctive
default logic and relate it to nested default logic.

A disjunctive default rule is an expression of the form

$$\frac{\alpha : \beta_1, \ldots, \beta_m}{\gamma_1 | \cdots | \gamma_n} \qquad (1)$$

where $\alpha, \beta_1, \ldots, \beta_m, \gamma_1, \ldots, \gamma_n$ are classical
formulas ($m \geq 0$, $n \geq 1$). Reiter’s default logic corresponds to the
special case when $n = 1$.¹

A disjunctive default rule (1) corresponds to the NDL rule

$$\frac{\alpha \wedge \mathit{not}\ \neg\beta_1 \wedge \cdots \wedge \mathit{not}\ \neg\beta_m}{\gamma_1 | \cdots | \gamma_n}$$

A  disjunctive  default  theory  is  a  set  of  disjunctive  default  rules. 

Let P be a disjunctive default theory and X a set of classical formulas. Define

$$P^X = \left\{ \frac{\alpha}{\gamma_1 | \cdots | \gamma_n} \;:\; \frac{\alpha : \beta_1, \ldots, \beta_m}{\gamma_1 | \cdots | \gamma_n} \in P \text{ and } \neg\beta_1, \ldots, \neg\beta_m \notin X \right\}.$$

A set Y of classical formulas is closed under $P^X$ if, for every member of
$P^X$, if $\alpha \in Y$ then at least one of $\gamma_1, \ldots, \gamma_n$ belongs
to Y.

We say X is an extension of P if X is minimal among sets of formulas closed
under propositional logic and closed under $P^X$.


¹ In Reiter’s formulation, a default theory is a pair $(P, W)$, where the second
component W is a set of classical formulas. Here we suppress the second
component, since every $\phi \in W$ can be equivalently represented in P by the
rule $\frac{\top\, :}{\phi}$.
Proposition 3. A candidate set X is an extension of a disjunctive default
theory P iff it is an extension of the corresponding nested default theory.

Proposition  3  restricts  attention  to  candidate  sets  (which  are  by  definition 
consistent)  because,  unlike  nested  default  logic,  disjunctive  default  logic  allows 
for  the  possibility  of  an  inconsistent  extension. 

7  Strong  Equivalence  of  Default  Theories 

Nested default theories P and Q are equivalent if they have the same extensions.
They are strongly equivalent if, for any nested default theory R, $P \cup R$ and
$Q \cup R$ are equivalent.

For nested default theory P, and candidate sets X, Y with $X \subseteq Y$, the
pair $(X, Y)$ is an SE-model of P if both X and Y are closed under $P^Y$.

Theorem  3.  Nested  default  theories  are  strongly  equivalent  iff  they  have  the 
same  SE-models. 

A proof of Theorem 3 is easily obtained from the proof of Theorem 1, and
so is not presented in this paper. (Essentially, replace references to
“consistent sets of literals” with references to “candidate sets.”)

The  proof  shows  that  any  two  nested  default  theories  that  are  not  strongly 
equivalent  can  be  distinguished  by  adding  a  nested  default  theory  in  which  the 
conditions  and  conclusions  of  all  rules  are  classical  formulas. 


8  Equivalent  Transformations  of  Default  Theories 

As with logic programs (using Theorem 1), Theorem 3 can be used, for example,
to show that in any default theory containing the rule

$$\frac{F \wedge G}{\bot},$$

the rule

$$\frac{\top}{F \mid G}$$

can be replaced by the rules

$$\frac{\mathit{not}\ F}{G}, \qquad \frac{\mathit{not}\ G}{F}.$$

Moreover,  it  is  clear  that  replacing  any  occurrence  of  one  classical  formula 
with  another  that  is  logically  equivalent  (in  classical  logic)  yields  a  strongly 
equivalent  default  theory. 
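
The first replacement above can be checked on a small instance by enumerating SE-models of default theories. This is a sketch under assumptions of its own: the language is limited to the atoms p and q, F and G are instantiated by those atoms, and a candidate set is represented by its nonempty set of models, so that set inclusion between candidate sets becomes reverse inclusion between model sets. Classical formulas are Python predicates, NDL connectives are tuples, rules are (condition, conclusion) pairs, and the function names are hypothetical.

```python
from itertools import combinations, product

TOP = lambda v: True
BOT = lambda v: False

def atom(a):
    return lambda v: a in v

def sat(M, f):
    """Th(M) |= f, where M is the set of models of a candidate set."""
    if callable(f):
        return all(f(v) for v in M)
    if f[0] == "not":
        return not sat(M, f[1])
    left, right = sat(M, f[1]), sat(M, f[2])
    return (left and right) if f[0] == "and" else (left or right)

def reduct(f, M):
    if callable(f):
        return f
    if f[0] == "not":
        return BOT if sat(M, f[1]) else TOP
    return (f[0], reduct(f[1], M), reduct(f[2], M))

def closed(M, theory, N):
    return all(sat(M, reduct(concl, N))
               for cond, concl in theory if sat(M, reduct(cond, N)))

def candidate_sets(atoms):
    vals = [frozenset(a for a, bit in zip(atoms, bits) if bit)
            for bits in product([False, True], repeat=len(atoms))]
    return [frozenset(c) for n in range(1, len(vals) + 1)
            for c in combinations(vals, n)]

def se_models(theory, atoms):
    """Pairs (X, Y) of model sets with Th(X) a subset of Th(Y), i.e. X >= Y,
    such that both are closed under the reduct relative to Th(Y)."""
    cands = candidate_sets(atoms)
    return {(X, Y) for Y in cands if closed(Y, theory, Y)
            for X in cands if X >= Y and closed(X, theory, Y)}

atoms = ["p", "q"]
p, q = atom("p"), atom("q")
constraint = (("and", p, q), BOT)                        # (F and G) / false
T1 = [constraint, (TOP, ("bar", p, q))]                  # true / (F | G)
T2 = [constraint, (("not", p), q), (("not", q), p)]      # not F / G,  not G / F

with_constraint = se_models(T1, atoms) == se_models(T2, atoms)
without_constraint = se_models(T1[1:], atoms) == se_models(T2[1:], atoms)
print(with_constraint, without_constraint)
```

By Theorem 3, equal SE-models on this instance means the two theories are strongly equivalent; dropping the rule with conclusion ⊥ breaks the equality.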

We  can  formulate  an  additional  replacement  theorem,  similar  to  Theorem  2 
for  logic  programming,  thus  extending  our  account  of  when  an  occurrence  of  one 
formula  may  be  safely  replaced  by  another.  Again  we  need  some  definitions  first. 
We say that NDL formulas F and G are equivalent relative to nested default
theory P if, for every SE-model $(X, Y)$ of P, $X \models F^Y$ iff
$X \models G^Y$.

An occurrence of a subformula in an NDL formula is called regular if it is
not a proper subpart of an occurrence of a subformula formed by an application
of $\neg$ or $\vee$.

Theorem  4.  Let  P  be  a  nested  default  theory,  and  let  F  and  G  be  formulas 
equivalent  relative  to  P.  For  any  nested  default  theory  Q,  and  any  nested  default 
theory  Q'  obtained  from  Q  by  replacing  some  regular  occurrences  of  F  by  G, 
nested default theories $P \cup Q$ and $P \cup Q'$ are strongly equivalent.

As  with  Theorem  2,  the  restriction  to  regular  occurrences  is  essential.  (And 
essentially  the  same  example  shows  this.) 

Theorem 4 can be used to show, for example, that in any nested default
theory with rules

$$\frac{\mathit{not}\ F}{\neg F}, \qquad \frac{\mathit{not}\ \neg F}{F}$$

any occurrence of NDL formula $F \mid G$ (for any classical formula G) can be
safely replaced with $F \vee G$.

9  Proofs  Related  to  Nested  Default  Logic 

Proposition  2.  The  answer  sets  for  any  nested  logic  program  correspond  to  the 
extensions  of  the  corresponding  nested  default  theory. 

For any candidate set X, let $l(X)$ denote the set of all literals in X.

Lemma 1. For any NLP formula F and candidate set X, $l(X) \models F$ iff
$X \models d(F)$.

Proof. Straightforward, by structural induction. □

Lemma 2. For any NLP formula F and candidate set X, $d(F^{l(X)}) = d(F)^X$.

Proof. Follows easily from Lemma 1 and the definitions. □

Lemma 3. For any nested logic program P and candidate sets X and Y, $l(X)$ is
closed under $P^{l(Y)}$ iff X is closed under $d(P)^Y$.

Proof. Follows easily from Lemmas 1 and 2, and the definitions. □

Proof of Proposition 2: Take any nested logic program P. Assume X is an
answer set for P. So X is a consistent set of literals closed under $P^X$, and no
proper subset of X is closed under $P^X$. Let Y be the candidate set
corresponding to X, so that $l(Y) = X$. By Lemma 3, Y is closed under $d(P)^Y$.
Suppose a candidate set Z with $Z \subseteq Y$ is closed under $d(P)^Y$. By
Lemma 3, $l(Z)$ is closed under $P^X$. Since $Z \subseteq Y$, $l(Z) \subseteq X$.
We conclude by choice of X that $l(Z) = X$. It follows that $Z = Y$. So Y is an
extension of $d(P)$. Proof in the other direction is similar. □
Proposition 3. A candidate set X is an extension of a disjunctive default
theory P iff it is an extension of the corresponding nested default theory.

Proof. It is clear that for any disjunctive default theory P and corresponding
nested default theory Q, and any candidate sets X and Y, X is closed under $P^Y$
iff X is closed under $Q^Y$, from which the result follows. □

Theorem 4. Let P be a nested default theory, and let F and G be formulas
equivalent relative to P. For any nested default theory Q, and any nested default
theory Q' obtained from Q by replacing some regular occurrences of F by G,
nested default theories $P \cup Q$ and $P \cup Q'$ are strongly equivalent.

The  proof  of  Theorem  4  is  very  similar  to  the  proof  of  Proposition  3  from  [3], 
and  illustrates  how  a  proof  of  Theorem  2  might  go. 

We  begin  with  an  easily  verified  lemma. 

Lemma 4. For any NDL formula F and candidate set X, $X \models F$ iff
$X \models F^X$.

Lemma 5. Let F and G be NDL formulas equivalent relative to nested default
theory P. If an NDL formula H' can be obtained from an NDL formula H by
replacing some regular occurrences of F by G, then H and H' are equivalent
relative to P.

Proof. Consider any SE-model $(X, Y)$ of P. We need to show that
$X \models H^Y$ iff $X \models (H')^Y$. Proof is by structural induction on H.

Case 1: H is an atom or $H = \neg H_1$ or $H = H_1 \vee H_2$. Then the only
regular occurrence of a formula in H is H itself. Consequently $H = F$ and
$H' = G$, and we’re done.

Case 2: $H = H_1 \wedge H_2$. If $H = F$ and $H' = G$ we’re done. Otherwise,
$H' = H_1' \wedge H_2'$ and, by the induction hypothesis, $H_1$ and $H_1'$ are
equivalent relative to P, as are $H_2$ and $H_2'$. Then

$X \models H^Y$ iff $X \models (H_1 \wedge H_2)^Y$
                iff $X \models H_1^Y \wedge H_2^Y$
                iff $X \models H_1^Y$ and $X \models H_2^Y$
                iff $X \models (H_1')^Y$ and $X \models (H_2')^Y$
                iff $X \models (H_1')^Y \wedge (H_2')^Y$
                iff $X \models (H_1' \wedge H_2')^Y$
                iff $X \models (H')^Y$.

Case 3: $H = H_1 \mid H_2$. Similar to Case 2.

Case 4: $H = \mathit{not}\ H_1$. If $H = F$ and $H' = G$ we’re done. Otherwise,
$H' = \mathit{not}\ H_1'$ and, by the induction hypothesis, $H_1$ and $H_1'$ are
equivalent relative to P. Assume that $X \models (\mathit{not}\ H_1)^Y$. Then
$(\mathit{not}\ H_1)^Y = \top$, so $Y \not\models H_1$. It follows by Lemma 4 that
$Y \not\models H_1^Y$. Since $(X, Y)$ is an SE-model of P, so is $(Y, Y)$. Since
$H_1$ and $H_1'$ are equivalent relative to P, we can conclude that
$Y \not\models (H_1')^Y$. By Lemma 4, $Y \not\models H_1'$. So
$(\mathit{not}\ H_1')^Y = \top$, and thus $X \models (\mathit{not}\ H_1')^Y$. The
other direction is symmetric. □
Proof of Theorem 4. Assume that Q' can be obtained from Q by replacing
some regular occurrences of F by a formula G that is equivalent relative to P.
We must show that $P \cup Q$ and $P \cup Q'$ have the same SE-models.

Consider any SE-model $(X, Y)$ of P. It is enough to show that both X
and Y are closed under $Q^Y$ iff both are closed under $(Q')^Y$. So consider any
rule $\frac{H_1}{H_2} \in Q$, along with the corresponding rule
$\frac{H_1'}{H_2'} \in Q'$. By Lemma 5, $X \models (H_1)^Y$ iff
$X \models (H_1')^Y$, and similarly $X \models (H_2)^Y$ iff $X \models (H_2')^Y$.
We conclude that X is closed under $Q^Y$ iff it is closed under $(Q')^Y$. Since
$(Y, Y)$ is also an SE-model of P, the same argument can be used to show that Y
is closed under $Q^Y$ iff it is closed under $(Q')^Y$. □
Abstract. In this paper, the expressive power of disjunctive rules
involving default negation is analyzed within a framework based on polynomial,
faithful and modular (PFM) translations. The analysis is restricted
to the stable semantics of disjunctive logic programs. Of particular
interest is the effect of allowing default negation in the heads of
disjunctive rules. It is established in the paper that occurrences
of default negation can be removed from the heads of rules using a PFM
translation when default negation is allowed in the bodies of rules. In
this case, we may conclude that default negation appearing in the heads
of rules does not affect the expressive power of rules. However, in the case
that default negation may not be used in the bodies of rules, such a PFM
translation is no longer possible. Moreover, there is no PFM translation
for removing default negation from the bodies of rules. Consequently,
disjunctive logic programs with default negation in the bodies of rules
are strictly more expressive than those without.


1  Introduction 

Logic programming with answer sets [6,7] as proposed by Gelfond and Lifschitz
has recently been recognized as a logic programming paradigm of its own [22,23].
This is mainly because problems from many domains such as planning [18],
configuration [30] and verification [9] have attractive formulations as logic
programs under the answer set semantics [6]. Much of the promise of the
paradigm is also due to efficient implementations [17,24] that currently allow
computing answer sets for logic programs with thousands of rules. Being able to
handle programs of this scale has already turned out to be sufficient to enable
industrial applications of the answer set programming approach.

Our interest in answer set programming lies in comparing the expressive powers
of various types of rules that have been introduced by the logic programming
community. This paper can be viewed as a continuation of previous work on
the expressive power of non-monotonic logics [8,10,12,13,14]. In [15], the
author extends similar techniques to some syntactically restricted classes of
logic programs. The analysis is based on the existence of polynomial, faithful
and modular (PFM) translation functions between classes. This gives rise to a
hierarchy of classes of logic programs ordered by expressive power. However,
the results presented in [15] are limited to very special subclasses of normal
logic programs, since the goal is studying how the number of positive body
literals affects the expressiveness of rules. In this paper, more general
classes of logic programs involving disjunction are taken into consideration.
The semantics of programs in these classes is determined by respective
generalizations [7,19] of the answer set semantics [6].

* A preliminary version of this paper was presented at the 5th Dutch-German
Workshop on Nonmonotonic Reasoning Techniques and their Applications
(DGNMR'01).

Historically speaking, the answer set semantics has its roots in the stable
model semantics [5] of normal logic programs (also known as general logic
programs [20]). This class is obtained from ordinary logic programs (that
consist of rules that are effectively Horn clauses) by allowing the use of a
form of negation, namely negation as failure to prove [20], in the bodies of
rules. Due to close interconnections to Reiter's default logic [28], this form
of negation is also known as default negation. Default negation differs from
classical negation and it is therefore quite natural that Gelfond and Lifschitz
proposed a logic programming approach with both negations [6]. This is how the
answer set semantics originated as a generalization of the stable model
semantics. Later on, Gelfond and Lifschitz extended the answer set semantics to
cover disjunctive logic programs with classical negation [7] (Przymusinski [26]
presented similar ideas, but in a more general setting). The latest
generalization [19,18] to answer set programming allows occurrences of default
negation in the heads of disjunctive rules as well.

In  this  paper,  we  restrict  ourselves  to  the  class  of  disjunctive  logic  programs 
without  classical  negation  and  use  PFM  translation  functions  to  evaluate  the 
effects  of  extending  the  rule  language  with  default  negation  (i)  in  the  bodies 
of  rules,  (ii)  in  the  heads  of  rules,  and  (iii)  in  both.  The  rest  of  the  paper  is 
organized  as  follows.  Section  2  gives  a  brief  introduction  to  disjunctive  logic 
programs  and  the  stable  model  semantics.  In  Section  3,  we  present  the  analysis 
method  based  on  PFM  translation  functions.  The  method  is  then  applied  in 
Section  4  to  evaluate  the  effects  of  default  negation  on  the  expressiveness  of 
disjunctive  rules.  After  that  some  comparisons  with  related  work  are  performed 
in  Section  5.  Finally,  the  paper  ends  with  a  discussion  in  Section  6. 

2  Disjunctive  Logic  Programs 

In this paper, we consider disjunctive logic programs in the propositional
case¹. We let $\sim$ stand for default negation in order to distinguish it from
classical negation $\neg$. Given a (propositional) atom $a$, we define positive
and negative literals as expressions of the forms $a$ and ${\sim}a$,
respectively. To handle sets of negative literals nicely, we define
${\sim}A = \{{\sim}a \mid a \in A\}$ for a set of atoms $A$. In general, a
disjunctive logic program $P$ is a set of rules of the form

$A \vee {\sim}B \leftarrow C \wedge {\sim}D$    (1)

where $A$, $B$, $C$ and $D$ are sets of atoms. The literals in
$A \cup {\sim}B$ form the head of the rule while the literals in
$C \cup {\sim}D$ form the body of the rule. The intuition behind a rule of the
form (1) is that if all the atoms in $C$ can be inferred and none of the atoms
in $D$ can be inferred, then one of the atoms in $A$ can be inferred or one of
the atoms in $B$ cannot be inferred. This is how the head of the rule is
interpreted disjunctively while the body is subject to a conjunctive
interpretation². The Herbrand base $\mathrm{Hb}(P)$ of a disjunctive logic
program $P$ is the set of atoms that appear in $P$. The class of all
disjunctive logic programs is denoted by $\mathcal{D}$. A disjunctive logic
program $P$ is positive if all rules (1) of $P$ satisfy $B = \emptyset$ and
$D = \emptyset$. Quite similarly, a program $P$ is head-positive (alternatively
body-positive)³ if all rules (1) of $P$ satisfy $B = \emptyset$ (alternatively
$D = \emptyset$). The respective classes of disjunctive logic programs are
denoted by $\mathcal{D}^{+}$, $\mathcal{D}^{h+}$ and $\mathcal{D}^{b+}$. These
definitions imply that
$\mathcal{D}^{+} \subset \mathcal{D}^{h+} \subset \mathcal{D}$ and
$\mathcal{D}^{+} \subset \mathcal{D}^{b+} \subset \mathcal{D}$.

¹ Disjunctive programs with variables are also covered through Herbrand
instantiation. In the presence of function symbols, Herbrand instantiation
produces an infinite (but countable) propositional program out of a finite
disjunctive program with variables.
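The syntactic classes just defined can be made concrete with a small data structure. The following is a minimal sketch, not part of the paper: the names `Rule`, `rule`, `herbrand_base` and the three predicates are hypothetical helpers, and a rule of the form (1) is assumed to be stored as four sets of atom names.

```python
from typing import FrozenSet, NamedTuple


class Rule(NamedTuple):
    """A rule A ∨ ∼B ← C ∧ ∼D, stored as four sets of atoms (illustrative encoding)."""
    A: FrozenSet[str] = frozenset()
    B: FrozenSet[str] = frozenset()
    C: FrozenSet[str] = frozenset()
    D: FrozenSet[str] = frozenset()


def rule(A=(), B=(), C=(), D=()):
    # Convenience constructor accepting any iterables of atom names.
    return Rule(frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def herbrand_base(program):
    # Hb(P): all atoms that appear somewhere in P.
    atoms = set()
    for r in program:
        atoms |= r.A | r.B | r.C | r.D
    return frozenset(atoms)


def is_head_positive(program):
    # No default negation in heads: B = ∅ for every rule.
    return all(not r.B for r in program)


def is_body_positive(program):
    # No default negation in bodies: D = ∅ for every rule.
    return all(not r.D for r in program)


def is_positive(program):
    return is_head_positive(program) and is_body_positive(program)
```

For instance, `[rule(A={"a"}, B={"a"})]` encodes the single rule a ∨ ∼a ←, which is body-positive but not head-positive.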

2.1  Stable  Models  and  Answer  Sets 

Because this paper is restricted to classes of disjunctive logic programs
without classical negation, the forthcoming definition of stable models
coincides with that of answer sets [19]. The standard way to define the
semantics of a positive disjunctive logic program $P$ is to distinguish models
of $P$ that are minimal as follows. An interpretation of $P$ is simply a subset
of $\mathrm{Hb}(P)$ and a rule $A \leftarrow C$ of $P$ is satisfied in an
interpretation $I \subseteq \mathrm{Hb}(P)$ of $P$ if $C \subseteq I$ implies
$A \cap I \neq \emptyset$. A set of atoms $M \subseteq \mathrm{Hb}(P)$ is a
model of $P$ if all rules of $P$ are satisfied in $M$. A model $M$ of $P$ is a
(subset) minimal model of $P$ if there is no model $M'$ of $P$ such that
$M' \subset M$. By this definition, it is possible that a positive disjunctive
program has no minimal models ($P_1 = \{a \leftarrow,\ \leftarrow a\}$), a
unique minimal model ($P_2 = \{a \leftarrow\}$) or even several minimal models
($P_3 = \{a \vee b \leftarrow\}$). By a slight abuse of notation, we write
$M = \mathrm{Mm}(P)$ to declare that $M$ is one of the minimal models of a
positive disjunctive logic program $P$. Thus we may write
$M_1 = \{a\} = \mathrm{Mm}(P_3)$ as well as $M_2 = \{b\} = \mathrm{Mm}(P_3)$
although these models are not unique.
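The three situations (no, one, or several minimal models) can be checked by brute force on tiny programs. The sketch below is illustrative only: `subsets`, `satisfied` and `minimal_models` are hypothetical names, a positive rule A ← C is assumed to be a pair `(head, body)` of frozensets, and the inconsistent program {a ←, ← a} is used here as one possible example of a program without models.

```python
from itertools import combinations


def subsets(atoms):
    # All subsets of a set of atoms, smallest first.
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def satisfied(head, body, I):
    # A ← C is satisfied in I if C ⊆ I implies A ∩ I ≠ ∅.
    return not body <= I or bool(head & I)


def minimal_models(program):
    """Subset-minimal models of a positive program given as (head, body) pairs."""
    base = frozenset().union(*(h | b for h, b in program))
    models = [I for I in subsets(base)
              if all(satisfied(h, b, I) for h, b in program)]
    return [M for M in models if not any(N < M for N in models)]


f = frozenset
P1 = [(f({"a"}), f()), (f(), f({"a"}))]   # {a ←, ← a}: inconsistent
P2 = [(f({"a"}), f())]                    # {a ←}
P3 = [(f({"a", "b"}), f())]               # {a ∨ b ←}
```

Calling `minimal_models` on `P1`, `P2` and `P3` yields zero, one and two minimal models, respectively. The enumeration is exponential in the number of atoms, so it is only meant for toy examples.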

The stable model semantics of disjunctive logic programs is obtained via
the Gelfond-Lifschitz reduction of a disjunctive logic program $P$ [7,18,19]
which presumes a model candidate $M$. The reduced program

$P^M = \{A \leftarrow C \mid A \vee {\sim}B \leftarrow C \wedge {\sim}D \in P,\ B \subseteq M, \text{ and } D \cap M = \emptyset\}$    (2)

is a positive one. A model $M$ of a disjunctive logic program $P$ is stable if
$M$ is a minimal model of $P^M$ (not necessarily a unique one), i.e.,
$M = \mathrm{Mm}(P^M)$.

² Rather than using a set-based notation (1), heads and bodies of rules are
often written as disjunctions and conjunctions, respectively. For instance,
when $A = \{a\}$, $B = \{b\}$, $C = \{c\}$, and $D = \{d\}$, we write
$a \vee {\sim}b \leftarrow c \wedge {\sim}d$ for the rule (1).
Example 1. Consider logic programs $P_1 = \{a \vee {\sim}a \leftarrow\}$ [19]
and $P_2 = \{a \leftarrow a\}$. The former has two stable models
$M_1 = \{a\}$ and $M_2 = \emptyset$ while $M_2$ is the unique stable model of
$P_2$. Note that $M_1$ is also a model of $P_2$, but not a minimal one.

The program $P_1$ illustrates how the negative literal ${\sim}a$ in the head
lets us express succinctly a choice regarding $a$: either $a$ is in the model
($a \in M_1$) or $a$ is not in the model ($a \notin M_2$). Simons [29] achieves
the same effect by enriching normal logic programs with choice rules. Note that
due to the negative literal ${\sim}a$ in the head of the only rule of $P_1$,
the stable models $M_1$ and $M_2$ of $P_1$ break the well-known anti-chain
property: $M_2 \subseteq M_1$ holds although $M_1 \neq M_2$.
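The claims of Example 1 can be reproduced with a brute-force enumeration of stable models. This is a minimal sketch under stated assumptions: rules of the form (1) are encoded as 4-tuples `(A, B, C, D)` of frozensets, and `rule`, `reduct`, `is_model` and `stable_models` are hypothetical helper names rather than anything defined in the paper.

```python
from itertools import combinations


def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def herbrand_base(program):
    return frozenset(atom for r in program for part in r for atom in part)


def subsets(atoms):
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def reduct(program, M):
    # Reduct (2): keep A ← C whenever B ⊆ M and D ∩ M = ∅.
    return [(A, C) for (A, B, C, D) in program if B <= M and not (D & M)]


def is_model(positive, I):
    # Every rule A ← C must satisfy: C ⊆ I implies A ∩ I ≠ ∅.
    return all(not C <= I or bool(A & I) for (A, C) in positive)


def stable_models(program):
    # M is stable iff M is a subset-minimal model of the reduct w.r.t. M.
    result = []
    for M in subsets(herbrand_base(program)):
        red = reduct(program, M)
        if is_model(red, M) and not any(
                is_model(red, N) for N in subsets(M) if N < M):
            result.append(M)
    return result


P1 = [rule(A={"a"}, B={"a"})]   # a ∨ ∼a ←
P2 = [rule(A={"a"}, C={"a"})]   # a ← a
```

Here `stable_models(P1)` returns the empty set and {a}, while `stable_models(P2)` returns only the empty set, in line with the example. The first result also exhibits the failure of the anti-chain property, since one stable model is a proper subset of the other.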

3  Polynomial,  Faithful  and  Modular  Translations 

In this paper, we employ a framework of polynomial, faithful and modular
translation functions for comparing the expressive powers of classes
$\mathcal{C}$ of logic programs [15]. Some basic assumptions are imposed on any
class $\mathcal{C}$ of logic programs. First of all, the class $\mathcal{C}$ is
supposed to be closed under unions, i.e., given any two programs $P$ and $P'$
from $\mathcal{C}$, then also $P \cup P'$ belongs to $\mathcal{C}$. Secondly,
it is assumed that $\mathcal{C}$ has a semantic operator
$\mathrm{Sem}_{\mathcal{C}}$ associated with it. The operator
$\mathrm{Sem}_{\mathcal{C}}$ assigns a set of interpretations
$I \subseteq \mathrm{Hb}(P)$ to each program $P$ of $\mathcal{C}$. Typically,
these interpretations are distinguished models of $P$. It is clear that the
classes $\mathcal{D}$, $\mathcal{D}^{+}$, $\mathcal{D}^{h+}$, and
$\mathcal{D}^{b+}$ satisfy these criteria. The semantic operator
$\mathrm{Sem}_{\mathcal{C}}$ is the same for each class $\mathcal{C}$ of these;
$\mathrm{Sem}_{\mathcal{C}}$ assigns
$\{M \subseteq \mathrm{Hb}(P) \mid M = \mathrm{Mm}(P^M)\}$ to a program $P$
whenever $P$ is a member of the respective class $\mathcal{C}$.

In the following definition, we list the general requirements for a translation
function Tr that transforms logic programs $P$ of one class $\mathcal{C}$ into
logic programs $\mathrm{Tr}(P)$ of another class $\mathcal{C}'$. The latter
class is assumed to be a subclass or a superclass of $\mathcal{C}$. We let
$\|P\|$ stand for the length of $P$ in symbols.

Definition 1. Given two classes of logic programs $\mathcal{C}$ and
$\mathcal{C}'$ that are closed under unions and the respective semantic
operators $\mathrm{Sem}_{\mathcal{C}}$ and $\mathrm{Sem}_{\mathcal{C}'}$, a
translation function $\mathrm{Tr} : \mathcal{C} \rightarrow \mathcal{C}'$ is

- polynomial if for all logic programs $P \in \mathcal{C}$, the time required
to compute the translation $\mathrm{Tr}(P) \in \mathcal{C}'$ is polynomial in
$\|P\|$,

- faithful if (i) for all logic programs $P \in \mathcal{C}$, the base
$\mathrm{Hb}(P) \subseteq \mathrm{Hb}(\mathrm{Tr}(P))$ and (ii) the
models/interpretations in $\mathrm{Sem}_{\mathcal{C}}(P)$ and
$\mathrm{Sem}_{\mathcal{C}'}(\mathrm{Tr}(P))$ are in a one-to-one
correspondence and coincide up to $\mathrm{Hb}(P)$, and

- modular if (i) for all logic programs $P_1 \in \mathcal{C}$ and
$P_2 \in \mathcal{C}$, the translation
$\mathrm{Tr}(P_1 \cup P_2) = \mathrm{Tr}(P_1) \cup \mathrm{Tr}(P_2)$ and
(ii) $\mathcal{C}' \subseteq \mathcal{C}$ implies that the translation
$\mathrm{Tr}(P') = P'$ for all logic programs $P' \in \mathcal{C}'$.

The faithfulness requirement implies that a translation function Tr may
introduce new atoms, but the number of such atoms is clearly bounded by
the polynomiality requirement. Let us also note that if Tr is faithful, then
$\mathrm{Sem}_{\mathcal{C}}(P) = \{M \cap \mathrm{Hb}(P) \mid M \in \mathrm{Sem}_{\mathcal{C}'}(\mathrm{Tr}(P))\}$
holds. The first part of the modularity condition enforces locality of Tr,
since the translation of a program $P_1 \cup P_2$ is obtained as the union of
the translations of the subprograms $P_1$ and $P_2$. This implies that programs
can be translated rule by rule. The second part handles cases where programs of
a class $\mathcal{C}$ are translated into programs in a proper subclass
$\mathcal{C}'$ of $\mathcal{C}$. Such a class $\mathcal{C}'$ is typically
obtained by restricting the syntax of the rules of the programs in
$\mathcal{C}$. In this setting, we require that syntactically restricted rules
are left intact by a translation function. Note that whenever
$\mathcal{C}' \subseteq \mathcal{C}$ holds, the joint effect of the modularity
conditions (i) and (ii) is that
$\mathrm{Tr}(P' \cup P) = P' \cup \mathrm{Tr}(P)$ holds for all logic programs
$P' \in \mathcal{C}'$ and $P \in \mathcal{C}$.

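We say that a translation function
$\mathrm{Tr} : \mathcal{C} \rightarrow \mathcal{C}'$ is PFM if it satisfies all
three criteria. If such a translation function exists, we write
$\mathcal{C} \leq_{\mathrm{PFM}} \mathcal{C}'$ and consider $\mathcal{C}'$ at
least as expressive as $\mathcal{C}$. In certain cases, we can find a
counter-example which proves that a translation function satisfying our
criteria does not exist. We use the notation
$\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$ in such cases. Any of the
letters P, F, and M may be omitted from the notation if the corresponding
criterion is not needed in the counter-example (note that
$\mathcal{C} \not\leq_{\mathrm{FM}} \mathcal{C}'$ implies
$\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$, for instance).

More complex relations among classes of logic programs can be deduced from
the base relations $\leq_{\mathrm{PFM}}$ and $\not\leq_{\mathrm{PFM}}$. A class
$\mathcal{C}$ is less expressive than $\mathcal{C}'$ (denoted by
$\mathcal{C} <_{\mathrm{PFM}} \mathcal{C}'$) if
$\mathcal{C} \leq_{\mathrm{PFM}} \mathcal{C}'$ and
$\mathcal{C}' \not\leq_{\mathrm{PFM}} \mathcal{C}$. Classes $\mathcal{C}$ and
$\mathcal{C}'$ are equally expressive (denoted by
$\mathcal{C} =_{\mathrm{PFM}} \mathcal{C}'$) if
$\mathcal{C} \leq_{\mathrm{PFM}} \mathcal{C}'$ and
$\mathcal{C}' \leq_{\mathrm{PFM}} \mathcal{C}$. Classes $\mathcal{C}$ and
$\mathcal{C}'$ are mutually incomparable (denoted by
$\mathcal{C} \neq_{\mathrm{PFM}} \mathcal{C}'$) if
$\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$ and
$\mathcal{C}' \not\leq_{\mathrm{PFM}} \mathcal{C}$. By these relations,
we have adapted the method proposed for non-monotonic logics [13] to
the case of logic programs (cf. [15] for a discussion on the main differences).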

4  Expressive  Power  Analysis 

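Recall the inclusions
$\mathcal{D}^{+} \subset \mathcal{D}^{h+} \subset \mathcal{D}$ stated in
Section 2. Since the semantic operators of these classes coincide, it follows
by the existence of an identity translation function $\mathrm{Tr}_{id}$ (i.e.,
$\mathrm{Tr}_{id}(P) = P$ holds for any $P$ from $\mathcal{D}^{+}$ or
$\mathcal{D}^{h+}$) that
$\mathcal{D}^{+} \leq_{\mathrm{PFM}} \mathcal{D}^{h+}$ and
$\mathcal{D}^{h+} \leq_{\mathrm{PFM}} \mathcal{D}$, but the strictness of these
relationships remains open. So let us begin our analysis by establishing
$\mathcal{D}^{+} <_{\mathrm{PFM}} \mathcal{D}^{h+}$.

Theorem 1. $\mathcal{D}^{h+} \not\leq_{\mathrm{FM}} \mathcal{D}^{+}$.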

Proof. Consider $P = \{a \leftarrow {\sim}a\}$ that clearly belongs to
$\mathcal{D}^{h+}$. Then suppose there is a faithful and modular translation
function Tr that maps $P$ to a positive logic program $\mathrm{Tr}(P)$ in
$\mathcal{D}^{+}$. It follows by the faithfulness of Tr that
$\mathrm{Hb}(P) \subseteq \mathrm{Hb}(\mathrm{Tr}(P))$. In addition, the
translation $\mathrm{Tr}(P)$ does not have minimal models, since $P$ does not
have stable models. This implies that $\mathrm{Tr}(P)$ has no models, i.e.,
$\mathrm{Tr}(P)$ is an inconsistent positive logic program. Then consider
$P' = P \cup \{a \leftarrow\}$ for which
$\mathrm{Tr}(P') = \mathrm{Tr}(P) \cup \{a \leftarrow\}$ holds, as Tr is
modular. But then $\mathrm{Tr}(P')$ has neither models nor minimal models so
that $P'$ does not have stable models, as Tr is faithful. A contradiction,
since $M = \{a\}$ is a stable model of $P'$. □
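The proof rests on two semantic facts about the counter-example: the program {a ← ∼a} has no stable models, whereas adding the fact a ← yields the stable model {a}. A brute-force check of both facts might look as follows; the encoding of rules as 4-tuples `(A, B, C, D)` and all function names are assumptions made for this sketch only.

```python
from itertools import combinations


def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def subsets(atoms):
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def reduct(program, M):
    # Keep A ← C whenever B ⊆ M and D ∩ M = ∅.
    return [(A, C) for (A, B, C, D) in program if B <= M and not (D & M)]


def is_model(positive, I):
    return all(not C <= I or bool(A & I) for (A, C) in positive)


def stable_models(program):
    base = frozenset(x for r in program for part in r for x in part)
    found = []
    for M in subsets(base):
        red = reduct(program, M)
        if is_model(red, M) and not any(
                is_model(red, N) for N in subsets(M) if N < M):
            found.append(M)
    return found


P = [rule(A={"a"}, D={"a"})]        # a ← ∼a
P_prime = P + [rule(A={"a"})]       # P ∪ {a ←}
```

With these definitions `stable_models(P)` is empty and `stable_models(P_prime)` consists of {a} only. A positive program behaves differently: once it has no models, no superset of it can have models either, which is exactly the monotonicity that the proof exploits.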

Let us then concentrate on establishing that
$\mathcal{D} \leq_{\mathrm{PFM}} \mathcal{D}^{h+}$ which implies that
$\mathcal{D} =_{\mathrm{PFM}} \mathcal{D}^{h+}$. For this result, we have to
find a way to translate disjunctive logic programs having occurrences of
default negation in the heads of rules into head-positive disjunctive logic
programs. For each atom $a \in \mathrm{Hb}(P)$, we introduce a new atom
$a^{\circ}$ which is to mean that $a$ cannot be inferred by the rules. In
analogy to [6], the atom $a^{\circ}$ can be understood as a "positive
occurrence" of the negative literal ${\sim}a$. The difference is that we apply
the idea to remove default negation while Gelfond and Lifschitz aim to remove
negative literals formed with classical negation. For a set of atoms
$A \subseteq \mathrm{Hb}(P)$, we let $A^{\circ}$ denote the set
$\{a^{\circ} \mid a \in A\}$. For any $P \in \mathcal{D}$, we distinguish a
particular subset of $\mathrm{Hb}(P)$:
$\mathrm{Hd}^{\sim}(P) = \bigcup \{B \mid A \vee {\sim}B \leftarrow C \wedge {\sim}D \in P\}$
is the set of atoms that appear negatively in the heads of the rules of $P$.
Default negation can be removed from the heads of rules using a translation
function $\mathrm{Tr}_{h+}$ to be defined as follows.

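Definition 2. For a disjunctive logic program $P$, let $\mathrm{Tr}_{h+}(P)$
denote the translation of $P$ into a head-positive disjunctive logic program

$\{\leftarrow a \wedge a^{\circ},\ \ a^{\circ} \leftarrow {\sim}a \mid a \in \mathrm{Hd}^{\sim}(P)\} \ \cup$
$\{A \cup B^{\circ} \leftarrow C \wedge {\sim}D \mid A \vee {\sim}B \leftarrow C \wedge {\sim}D \in P\}$.    (3)

Thus $\mathrm{Hb}(\mathrm{Tr}_{h+}(P)) = \mathrm{Hb}(P) \cup \mathrm{Hd}^{\sim}(P)^{\circ}$.
Let us establish that $\mathrm{Tr}_{h+}$ is PFM.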
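The translation of Definition 2 is simple enough to be sketched in a few lines and tested against a brute-force stable model enumeration. Everything below is an illustrative assumption rather than the author's implementation: rules are 4-tuples `(A, B, C, D)` of frozensets, the new atom for `a` is encoded as the string `a + "°"`, and `tr_h_plus` and `corresponds` are hypothetical names.

```python
from itertools import combinations


def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def herbrand_base(program):
    return frozenset(x for r in program for part in r for x in part)


def subsets(atoms):
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def reduct(program, M):
    return [(A, C) for (A, B, C, D) in program if B <= M and not (D & M)]


def is_model(positive, I):
    return all(not C <= I or bool(A & I) for (A, C) in positive)


def stable_models(program):
    found = []
    for M in subsets(herbrand_base(program)):
        red = reduct(program, M)
        if is_model(red, M) and not any(
                is_model(red, N) for N in subsets(M) if N < M):
            found.append(M)
    return found


def tr_h_plus(program):
    """Sketch of the head-positive translation; a° is encoded as a + '°'."""
    hd_neg = sorted({a for (_, B, _, _) in program for a in B})
    out = []
    for a in hd_neg:
        out.append(rule(C={a, a + "°"}))        # ← a ∧ a°
        out.append(rule(A={a + "°"}, D={a}))    # a° ← ∼a
    for (A, B, C, D) in program:
        marked = frozenset(b + "°" for b in B)
        out.append(rule(A=A | marked, C=C, D=D))  # A ∪ B° ← C ∧ ∼D
    return out


def corresponds(program):
    # Faithfulness check on one program: projections onto Hb(P) are
    # pairwise distinct and coincide with the stable models of P.
    hb = herbrand_base(program)
    projected = [M & hb for M in stable_models(tr_h_plus(program))]
    original = stable_models(program)
    return set(projected) == set(original) and len(projected) == len(original)
```

For the program {a ∨ ∼a ←}, the translation produces ← a ∧ a°, a° ← ∼a and a ∨ a° ←, whose stable models are {a} and {a°}. Because the function works rule by rule and leaves head-positive rules unchanged, it also illustrates the two modularity conditions of Definition 1.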

Theorem 2. Let $P$ be a disjunctive logic program. If
$M \subseteq \mathrm{Hb}(P)$ is a stable model of $P$, then
$M \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ}$ is a stable model of
$\mathrm{Tr}_{h+}(P)$.

Proof. Let $M$ be a stable model of $P$ and
$M' = M \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ}$. By the definitions of
$\mathrm{Tr}_{h+}(P)$ and $M'$, the reduct of $\mathrm{Tr}_{h+}(P)$ with
respect to $M'$ is

$\{\leftarrow a \wedge a^{\circ} \mid a \in \mathrm{Hd}^{\sim}(P)\} \cup \{a^{\circ} \leftarrow \ \mid a \in \mathrm{Hd}^{\sim}(P) - M\} \ \cup$
$\{A \cup B^{\circ} \leftarrow C \mid A \cup B^{\circ} \leftarrow C \wedge {\sim}D \in \mathrm{Tr}_{h+}(P) \text{ and } D \cap M' = \emptyset\}$.    (4)

The rules of the forms $\leftarrow a \wedge a^{\circ}$ and
$a^{\circ} \leftarrow$ in (4) are satisfied in $M'$ directly by the definition
of $M'$. Let us then assume that some rule $A \cup B^{\circ} \leftarrow C$ in
(4) is not satisfied in $M'$, i.e., $C \subseteq M'$ and
$(A \cup B^{\circ}) \cap M' = \emptyset$. It follows by the definition of $M'$
that $C \subseteq M$, $A \cap M = \emptyset$ and $B \subseteq M$. Also
$D \cap M = \emptyset$ holds by (4) and the definition of $M'$. Thus the rule
$A \leftarrow C$ belongs to $P^M$ and it is not satisfied by $M$. Thus $M$ is
not a model of $P^M$, a contradiction. Hence the rule
$A \cup B^{\circ} \leftarrow C$ is satisfied by $M'$. To conclude, we have
established that $M'$ is a model of the reduct (4). It remains to establish the
minimality of $M'$.

So let us assume that $M'$ is not a minimal model of (4), i.e., there is a
model $N'$ of (4) such that $N' \subset M'$. Now $N'$ and $M'$ must coincide on
the atoms of $\mathrm{Hd}^{\sim}(P)^{\circ}$, because $N' \subset M'$, $N'$ is a
model of (4), and the rule $a^{\circ} \leftarrow$ is included in (4) for each
$a \in \mathrm{Hd}^{\sim}(P) - M$. Thus $N \subset M$ holds for
$N = N' \cap \mathrm{Hb}(P)$. Then assume that $N$ is not a model of $P^M$,
i.e., there is a rule $A \leftarrow C \in P^M$ such that $C \subseteq N$ and
$A \cap N = \emptyset$. So there is a rule
$A \vee {\sim}B \leftarrow C \wedge {\sim}D$ in $P$ such that $B \subseteq M$
and $D \cap M = \emptyset$. Consequently,
$A \cup B^{\circ} \leftarrow C \wedge {\sim}D$ belongs to
$\mathrm{Tr}_{h+}(P)$ and $D \cap M' = \emptyset$ implying that
$A \cup B^{\circ} \leftarrow C$ belongs to (4). Moreover, it follows by the
definitions of $N$ and $M'$ and the relationship $N' \subset M'$ that
$A \cap N' = \emptyset$, $B^{\circ} \cap N' = \emptyset$ and
$C \subseteq N'$. Thus $A \cup B^{\circ} \leftarrow C$ is not satisfied in
$N'$, a contradiction. Hence $N$ is a model of $P^M$. Then $N \subset M$
implies that $M$ is not a minimal model of $P^M$, contradicting the stability
of $M$. Thus $M'$ is a minimal model of (4), i.e., a stable model of
$\mathrm{Tr}_{h+}(P)$. □

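Theorem 3. Let $P$ be a disjunctive logic program. If
$M' \subseteq \mathrm{Hb}(P) \cup \mathrm{Hd}^{\sim}(P)^{\circ}$ is a stable
model of $\mathrm{Tr}_{h+}(P)$, then $M = M' \cap \mathrm{Hb}(P)$ is a stable
model of $P$.

Proof. Let $M' \subseteq \mathrm{Hb}(P) \cup \mathrm{Hd}^{\sim}(P)^{\circ}$ be
a stable model of $\mathrm{Tr}_{h+}(P)$ and define
$M = M' \cap \mathrm{Hb}(P)$. Consider any $a \in \mathrm{Hd}^{\sim}(P)$.
(i) Suppose that $a \in M$ and $a^{\circ} \in M'$. Then $a \in M'$ and
$\leftarrow a \wedge a^{\circ} \in \mathrm{Tr}_{h+}(P)^{M'}$ is not satisfied
in $M'$, a contradiction. (ii) Then assume that $a \notin M$ and
$a^{\circ} \notin M'$. Since
$a \in \mathrm{Hd}^{\sim}(P) \subseteq \mathrm{Hb}(P)$ and
$M = M' \cap \mathrm{Hb}(P)$, it follows that $a \notin M'$. This implies that
$a^{\circ} \leftarrow$ belongs to $\mathrm{Tr}_{h+}(P)^{M'}$. Since $M'$ is a
model of $\mathrm{Tr}_{h+}(P)^{M'}$, it holds necessarily that
$a^{\circ} \in M'$, a contradiction. Now (i) and (ii) imply for any
$a \in \mathrm{Hd}^{\sim}(P)$ that
$a \notin M \Leftrightarrow a^{\circ} \in M'$. Thus
$M' = M \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ}$.

Then consider any rule $A \vee {\sim}B \leftarrow C \wedge {\sim}D$ of the
original program $P$. Now
(iii) $A \leftarrow C \in P^M$
$\Leftrightarrow$ $B \subseteq M$ and $D \cap M = \emptyset$
$\Leftrightarrow$ $B^{\circ} \cap M' = \emptyset$ and $D \cap M' = \emptyset$
$\Leftrightarrow$ $B^{\circ} \cap M' = \emptyset$ and
$A \cup B^{\circ} \leftarrow C \in \mathrm{Tr}_{h+}(P)^{M'}$.

Let us then assume that $M$ is not a model of $P^M$. So there is a rule
$A \leftarrow C$ in $P^M$ such that $C \subseteq M$ and
$A \cap M = \emptyset$. This implies by (iii) that
$B^{\circ} \cap M' = \emptyset$ and the rule $A \cup B^{\circ} \leftarrow C$
belongs to $\mathrm{Tr}_{h+}(P)^{M'}$. It follows that $C \subseteq M'$ and
$(A \cup B^{\circ}) \cap M' = \emptyset$. Thus
$A \cup B^{\circ} \leftarrow C$ is not satisfied in $M'$, i.e., $M'$ is not a
model of $\mathrm{Tr}_{h+}(P)^{M'}$, a contradiction. Hence $M$ is a model of
$P^M$.

Finally, let us assume that $M$ is not a minimal model of $P^M$. Then there is
a model $N$ of $P^M$ such that $N \subset M$. Define an interpretation
$N' = N \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ}$ so that $N' \subset M'$ is
the case. Let us assume that $N'$ is not a model of
$\mathrm{Tr}_{h+}(P)^{M'}$, i.e., the reduct contains a rule which is not
satisfied in $N'$. Three cases arise.
(a) A rule $\leftarrow a \wedge a^{\circ}$ of $\mathrm{Tr}_{h+}(P)^{M'}$ is
false in $N'$. This implies that $a \in \mathrm{Hd}^{\sim}(P)$, $a \in N'$ and
$a^{\circ} \in N'$. By the relationship $N' \subset M'$, we obtain that
$a \in M'$ and $a^{\circ} \in M'$, a contradiction.
(b) A rule $a^{\circ} \leftarrow$ of $\mathrm{Tr}_{h+}(P)^{M'}$ is false in
$N'$. It follows that $a \in \mathrm{Hd}^{\sim}(P)$ and $a^{\circ} \notin N'$
so that $a^{\circ} \notin M'$ holds, as $N'$ and $M'$ coincide on the atoms of
$\mathrm{Hd}^{\sim}(P)^{\circ}$. Then $M'$ is not a model of
$\mathrm{Tr}_{h+}(P)^{M'}$, a contradiction.
(c) A rule $A \cup B^{\circ} \leftarrow C$ of $\mathrm{Tr}_{h+}(P)^{M'}$ is
false in $N'$. It follows that $C \subseteq N'$ and
$(A \cup B^{\circ}) \cap N' = \emptyset$ so that $C \subseteq N$ and
$A \cap N = \emptyset$. Moreover, $B^{\circ} \cap M' = \emptyset$ holds, as
$N'$ and $M'$ coincide on the atoms of $\mathrm{Hd}^{\sim}(P)^{\circ}$. Thus
$A \leftarrow C$ belongs to $P^M$ by (iii). In addition, this particular rule
is not satisfied in $N$ which contradicts the fact that $N$ is a model of
$P^M$.

By the preceding case analysis, $N'$ is a model of $\mathrm{Tr}_{h+}(P)^{M'}$,
which contradicts the minimality of $M'$, as $N' \subset M'$. Hence $M$ is a
minimal model of $P^M$ and a stable model of $P$. □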

Theorem 4. $\mathcal{D} \leq_{\mathrm{PFM}} \mathcal{D}^{h+}$.

Proof. It is obvious that $\mathrm{Tr}_{h+}$ is polynomial and modular. To
establish faithfulness we note that Theorem 2 gives rise to a mapping $f_1$
that maps a stable model $M$ of $P$ to a stable model
$f_1(M) = M \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ}$ of
$\mathrm{Tr}_{h+}(P)$. Then consider any two stable models $M$ and $N$ of $P$
such that $f_1(M) = f_1(N)$. It follows that $M = N$ so that $f_1$ is
injective. On the other hand, a mapping $f_2$ that maps a stable model $M'$ of
$\mathrm{Tr}_{h+}(P)$ to a stable model $f_2(M') = M' \cap \mathrm{Hb}(P)$ of
$P$ is obtained from Theorem 3. If we have two stable models $M'$ and $N'$ of
$\mathrm{Tr}_{h+}(P)$ such that $M = f_2(M') = f_2(N') = N$, it follows by the
proof of Theorem 3 that
$M' = M \cup (\mathrm{Hd}^{\sim}(P) - M)^{\circ} = N \cup (\mathrm{Hd}^{\sim}(P) - N)^{\circ} = N'$.
This indicates that $f_2$ is injective. Thus it is clear that $f_1$ and $f_2$
are bijective and inverses of each other. Consequently, the stable models of
$P$ and $\mathrm{Tr}_{h+}(P)$ are in a one-to-one correspondence and they
coincide up to $\mathrm{Hb}(P)$. □

Having established the equivalence of $\mathcal{D}$ and $\mathcal{D}^{h+}$ we
are ready to proceed to the analysis of body-positive programs. Recall that any
$P \in \mathcal{D}^{b+}$ is a set of rules of the form
$A \vee {\sim}B \leftarrow C$. By the denial of negative subgoals in the bodies
of rules, the semantic definitions are simplified accordingly. Given
$P \in \mathcal{D}^{b+}$ and a model candidate $M \subseteq \mathrm{Hb}(P)$,
the reduct $P^M$ contains a rule $A \leftarrow C$ whenever $B \subseteq M$ for
some rule $A \vee {\sim}B \leftarrow C \in P$. The definition of stable models
remains intact, i.e., $M = \mathrm{Mm}(P^M)$. However, the properties of $P^M$
let us establish interesting results for the programs of $\mathcal{D}^{b+}$ as
follows.

Lemma 1. If $Q \in \mathcal{D}^{b+}$, $P \subseteq Q$, and
$M_1 \subseteq M_2 \subseteq \mathrm{Hb}(Q)$, then
$P^{M_1} \subseteq Q^{M_2}$.

Indeed, the reduct $P^M$ grows monotonically with respect to $P$ and $M$. This
is in contrast with head-positive programs $P \in \mathcal{D}^{h+}$ that
satisfy $P^{M_2} \subseteq P^{M_1}$ for
$M_1 \subseteq M_2 \subseteq \mathrm{Hb}(P)$. The monotonicity properties of
$P^M$ let us extend well-known properties of minimal models of positive
disjunctive programs to cover stable models of body-positive disjunctive
programs.
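The opposite monotonicity behaviours of the reduct can be observed on one-rule programs. The following sketch assumes rules encoded as 4-tuples `(A, B, C, D)` of frozensets and uses hypothetical helper names; it merely instantiates Lemma 1 and the contrasting remark on head-positive programs, and proves nothing in general.

```python
def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def reduct(program, M):
    # Reduct as a set of positive rules (A, C): B ⊆ M and D ∩ M = ∅ required.
    M = frozenset(M)
    return {(A, C) for (A, B, C, D) in program if B <= M and not (D & M)}


body_positive = [rule(A={"a"}, B={"b"})]   # a ∨ ∼b ←
head_positive = [rule(A={"a"}, D={"b"})]   # a ← ∼b

M1, M2 = frozenset(), frozenset({"b"})     # M1 ⊆ M2

# Body-positive case: the reduct grows with the model candidate.
grows = reduct(body_positive, M1) <= reduct(body_positive, M2)

# Head-positive case: the reduct shrinks as the model candidate grows.
shrinks = reduct(head_positive, M2) <= reduct(head_positive, M1)
```

Both `grows` and `shrinks` evaluate to `True`: the rule a ∨ ∼b ← only enters the reduct once b is in the candidate, whereas a ← ∼b leaves the reduct at that point.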

Lemma 2. If $P \in \mathcal{D}^{b+}$ and $M \subseteq \mathrm{Hb}(P)$ is a
model of $P^M$, then $P$ has a stable model $N \subseteq \mathrm{Hb}(P)$ such
that $N \subseteq M$.

Proof sketch. Let $M \subseteq \mathrm{Hb}(P)$ be a model of $P^M$ for
$P \in \mathcal{D}^{b+}$. Then we may use transfinite induction to construct a
descending sequence of interpretations
$M_0 \supseteq M_1 \supseteq M_2 \supseteq \cdots$ such that (i) $M_0 = M$,
(ii) $M_\alpha \subseteq M_{\alpha-1}$ can be chosen as
$\mathrm{Mm}(P^{M_{\alpha-1}})$ for a successor ordinal $\alpha$, and
(iii) $M_\alpha$ is defined as the limit $\bigcap_{\beta<\alpha} M_\beta$ for a
limit ordinal $\alpha$. The construction can be done so that $M_\alpha$ remains
a model of $P^{M_\alpha}$ for any ordinal $\alpha$. Moreover, it follows for a
sufficiently large successor ordinal $\alpha$
($|\alpha| > |\mathrm{Hb}(P)|$) that $M_\alpha = M_{\alpha-1}$. This implies by
(ii) that $M_\alpha = \mathrm{Mm}(P^{M_\alpha})$ so that $N = M_\alpha$ is a
stable model of $P$. In an extreme case, $N$ may become empty. This is
demonstrated in Example 2. □

Example 2. Consider an infinite body-positive disjunctive logic program
$P = \{b_i \vee {\sim}b_{i-1} \leftarrow \ \mid i > 0\}$. It is clear that
$M_0 = \mathrm{Hb}(P) = \{b_i \mid i \geq 0\}$ is a model of
$P^{M_0} = \{b_i \leftarrow \ \mid i \geq 1\}$, but not a minimal one, as
$M_1 = \{b_i \mid i \geq 1\}$ is the unique minimal model of $P^{M_0}$.
Similarly, for any $j > 0$, $M_{j-1} = \{b_i \mid i \geq j-1\}$ is not a
minimal model of $P^{M_{j-1}} = \{b_i \leftarrow \ \mid i \geq j\}$, but
$M_j = \{b_i \mid i \geq j\}$ is. It follows that
$N = \bigcap_{i \geq 0} M_i = \emptyset$ is a stable model of $P$. This is
obvious, since the reduct $P^N = \emptyset$. Note that $N$ is in fact the
unique stable model of $P$. □
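For finite programs the descending construction of Lemma 2 reduces to a simple loop. The sketch below applies it to a finite truncation of Example 2 (rules for 0 < i ≤ n only; the truncation, the tuple encoding of rules and all function names are assumptions made for illustration, since the example itself is infinite).

```python
from itertools import combinations


def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def subsets(atoms):
    # Smallest subsets first, so the first model found has least cardinality.
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def reduct(program, M):
    return [(A, C) for (A, B, C, D) in program if B <= M and not (D & M)]


def is_model(positive, I):
    return all(not C <= I or bool(A & I) for (A, C) in positive)


def descend(program, M):
    """Chain M_0 ⊇ M_1 ⊇ ... where M_{i+1} is a minimal model of the reduct
    w.r.t. M_i chosen inside M_i; stops when M_i is itself minimal (stable)."""
    M = frozenset(M)
    chain = [M]
    while True:
        red = reduct(program, M)
        assert is_model(red, M), "invariant: M_i is a model of its own reduct"
        # A least-cardinality model inside M_i is subset-minimal.
        nxt = next(N for N in subsets(M) if is_model(red, N))
        if nxt == M:
            return chain
        M = nxt
        chain.append(M)


n = 3
truncated = [rule(A={f"b{i}"}, B={f"b{i-1}"}) for i in range(1, n + 1)]
chain = descend(truncated, {f"b{i}" for i in range(n + 1)})
```

With n = 3 the chain drops b0, b1, b2 and b3 one at a time and ends in the empty set, mirroring the sequence of the example. The invariant checked by the `assert` corresponds to the claim in the proof sketch that every member of the sequence remains a model of its own reduct, which holds for body-positive programs by Lemma 1.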
Proposition 1. Consider $P \in \mathcal{D}^{b+}$ and $Q \in \mathcal{D}^{b+}$
such that $P \subseteq Q$. (i) If $Q$ has a stable model
$M \subseteq \mathrm{Hb}(Q)$ then $P$ has a stable model
$N \subseteq \mathrm{Hb}(P)$ such that $N \subseteq M$. (ii) If $P$ has no
stable models, neither has $Q$.

Proof. Suppose that $M$ is a stable model of $Q$, i.e.,
$M = \mathrm{Mm}(Q^M)$. Since $P^{M'} \subseteq Q^M$ holds for
$M' = M \cap \mathrm{Hb}(P)$ by Lemma 1, we know that $M$ and $M'$ are models
of $P^{M'}$. Thus $P$ has a stable model $N \subseteq M' \subseteq M$ by
Lemma 2. The claim (ii) of this proposition follows easily from (i) by
contrapositive argumentation. □

To characterize the expressive power of the class $\mathcal{D}^{b+}$, we note
that $\mathcal{D}^{+} \leq_{\mathrm{PFM}} \mathcal{D}^{b+}$ and
$\mathcal{D}^{b+} \leq_{\mathrm{PFM}} \mathcal{D}$ hold directly by the
relationships $\mathcal{D}^{+} \subset \mathcal{D}^{b+} \subset \mathcal{D}$
and the identity translation function $\mathrm{Tr}_{id}$. The latter
relationship is shown to be a strict one in the following theorem. Thus
body-positive disjunctive programs are strictly less expressive than general as
well as head-positive disjunctive programs, as implied by the fact that
$\mathcal{D}^{h+} \leq_{\mathrm{PFM}} \mathcal{D}$ and Theorem 4.

Theorem 5. $\mathcal{D} \not\leq_{\mathrm{FM}} \mathcal{D}^{b+}$.

Proof. Consider $P = \{a \leftarrow {\sim}a\}$ from $\mathcal{D}$. Suppose
there is a faithful and modular translation function Tr that maps $P$ to a
program $\mathrm{Tr}(P)$ of $\mathcal{D}^{b+}$. Since $P$ has no stable models,
neither has $\mathrm{Tr}(P)$ by the faithfulness of Tr. As Tr is modular, we
know that $\mathrm{Tr}(P') = \mathrm{Tr}(P) \cup \{a \leftarrow\}$ holds for
$P' = P \cup \{a \leftarrow\}$. Thus $\mathrm{Tr}(P')$ has no stable models by
Proposition 1. This contradicts the faithfulness of Tr, since $P'$ has a unique
stable model $M = \{a\}$. □

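Let us then address the relationship
$\mathcal{D}^{+} \leq_{\mathrm{PFM}} \mathcal{D}^{b+}$. Our last theorem
provides a concrete counter-example to establish that body-positive disjunctive
programs are strictly more expressive than positive ones, i.e.,
$\mathcal{D}^{+} <_{\mathrm{PFM}} \mathcal{D}^{b+}$ holds.

Theorem 6. $\mathcal{D}^{b+} \not\leq_{\mathrm{PFM}} \mathcal{D}^{+}$.

Proof. Consider a body-positive logic program
$P = \{a \vee {\sim}b \leftarrow\}$. Suppose there is a PFM translation
function Tr from $\mathcal{D}^{b+}$ to $\mathcal{D}^{+}$ that maps $P$ to a
positive program $\mathrm{Tr}(P)$ such that
$\mathrm{Hb}(P) \subseteq \mathrm{Hb}(\mathrm{Tr}(P))$. It follows by the
modularity of Tr that $P \cup \{b \leftarrow a\}$ is translated into
$\mathrm{Tr}(P \cup \{b \leftarrow a\}) = \mathrm{Tr}(P) \cup \{b \leftarrow a\}$.
Note that $\mathrm{Hb}(P \cup \{b \leftarrow a\}) = \mathrm{Hb}(P)$ and
$\mathrm{Hb}(\mathrm{Tr}(P \cup \{b \leftarrow a\})) = \mathrm{Hb}(\mathrm{Tr}(P))$.

Now $P \cup \{b \leftarrow a\}$ has two stable models $M_1 = \emptyset$ and
$M_2 = \{a, b\}$. This implies by the faithfulness of Tr that
$\mathrm{Tr}(P) \cup \{b \leftarrow a\}$ has exactly two minimal models
$N_1 \subseteq \mathrm{Hb}(\mathrm{Tr}(P))$ and
$N_2 \subseteq \mathrm{Hb}(\mathrm{Tr}(P))$ such that
$M_1 = N_1 \cap \mathrm{Hb}(P)$ and $M_2 = N_2 \cap \mathrm{Hb}(P)$. It follows
that both $N_1$ and $N_2$ are models of $\mathrm{Tr}(P)$, but not necessarily
minimal ones. Consequently, there exist minimal models $N_1'$ and $N_2'$ of
$\mathrm{Tr}(P)$ such that $N_1' \subseteq N_1$ and $N_2' \subseteq N_2$. Since
$P$ has a unique stable model $M = \emptyset$, it follows by the faithfulness
of Tr that $N_1'$ and $N_2'$ must be the same minimal model of
$\mathrm{Tr}(P)$, say $N'$. Moreover, $N' \subseteq N_1 \cap N_2$ and
$N' \cap \mathrm{Hb}(P) = M = \emptyset$. But then the rule $b \leftarrow a$ is
satisfied by $N'$ which is therefore a model of
$\mathrm{Tr}(P) \cup \{b \leftarrow a\}$ such that $N' \subset N_2$. Recall
that $N_1$ and $N_2$ form an antichain as minimal models of
$\mathrm{Tr}(P) \cup \{b \leftarrow a\}$. It follows that $N' \subset N_1$ and
$N' \subset N_2$, contradicting the minimality of $N_1$ and $N_2$. □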
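The counter-example can be checked mechanically: the program {a ∨ ∼b ←} has the empty set as its only stable model, and adding b ← a yields two stable models that are comparable under set inclusion, which minimal models of a positive program can never be. The sketch below uses an assumed tuple encoding `(A, B, C, D)` of rules and hypothetical helper names.

```python
from itertools import combinations


def rule(A=(), B=(), C=(), D=()):
    """Rule A ∨ ∼B ← C ∧ ∼D as a 4-tuple of frozensets of atoms."""
    return (frozenset(A), frozenset(B), frozenset(C), frozenset(D))


def subsets(atoms):
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)


def reduct(program, M):
    return [(A, C) for (A, B, C, D) in program if B <= M and not (D & M)]


def is_model(positive, I):
    return all(not C <= I or bool(A & I) for (A, C) in positive)


def stable_models(program):
    base = frozenset(x for r in program for part in r for x in part)
    found = []
    for M in subsets(base):
        red = reduct(program, M)
        if is_model(red, M) and not any(
                is_model(red, N) for N in subsets(M) if N < M):
            found.append(M)
    return found


def is_antichain(sets):
    # No member is a proper subset of another member.
    return not any(X < Y for X in sets for Y in sets)


P = [rule(A={"a"}, B={"b"})]                 # a ∨ ∼b ←
P_ext = P + [rule(A={"b"}, C={"a"})]         # P ∪ {b ← a}
```

Here `stable_models(P)` contains only the empty set, while `stable_models(P_ext)` contains the empty set and {a, b}, so `is_antichain` fails on the latter. This is the property that a positive translation, whose stable models are plain minimal models, cannot reproduce.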
5  Related  Work

Antoniou et al. [1] apply a modularity condition when developing normal forms
for Nute's defeasible logic [25]. Since the syntax of defeasible logic is based
on rules, too, it is worth comparing their notion of modularity with the one
applied in this paper. According to Antoniou et al., a translation function Tr
is modular, if
$D_1 \cup D_2 \equiv_{L(D_1) \cup L(D_2)} D_1 \cup \mathrm{Tr}(D_2)$ for any
defeasible theories $D_1$ and $D_2$. Here $\equiv$ denotes semantical
equivalence, i.e., the theories yield exactly the same conclusions in the union
of the respective languages $L(D_1)$ and $L(D_2)$ of $D_1$ and $D_2$.
Similarly, Tr is correct, if $D \equiv_{L(D)} \mathrm{Tr}(D)$ for every $D$,
and incremental, if
$D_1 \cup D_2 \equiv_{L(D_1) \cup L(D_2)} \mathrm{Tr}(D_1) \cup \mathrm{Tr}(D_2)$
for every $D_1$ and $D_2$. Thus any modular transformation is also incremental
and correct [1]. Note that the part (i) of our definition of modularity in
Definition 1 corresponds to incrementality. The main difference is that our
definition of modularity is purely syntactical: a modular translation need not
be faithful (i.e., correct in the terminology of Antoniou et al.). The notions
of faithfulness differ, too, since the skeptical semantics of defeasible
theories is based on proofs rather than models.

Inoue and Sakama [11] present an alternative way for removing default negation
from the heads of rules. Their idea is to translate (1) into
$A^{*} \cup B^{*} \leftarrow C \cup {\sim}D$ where
$A^{*} = \{a^{*} \mid a \in A\}$ and $B^{*} = \{b^{*} \mid b \in B\}$ are sets
of new atoms. In addition, the rules $a \leftarrow a^{*}$,
$a^{*} \leftarrow \{a\} \cup B$, $\leftarrow b \wedge b^{*}$, and
$\leftarrow a^{*} \wedge {\sim}b$ have to be introduced for each $a \in A$ and
$b \in B$. The resulting translation function $\mathrm{Tr}_{IS}$ is clearly
modular, but quadratic in the worst case. In contrast to this, the translation
function $\mathrm{Tr}_{h+}$ in Definition 2 is linear. The translational idea
behind $\mathrm{Tr}_{h+}$ is also simpler than that of $\mathrm{Tr}_{IS}$.
Inoue and Sakama [11, Remark 6.3] note anyway that the stable models of a
disjunctive program $P$ and $\mathrm{Tr}_{IS}(P)$ are in a one-to-one
correspondence. Thus $\mathrm{Tr}_{IS}$ is also PFM and Theorem 4 is also
implied by the results in [11]. Inoue and Sakama [11, Section 6.3] propose yet
another polynomial and modular translation function for removing default
negation from head-positive programs. However, Theorem 1 implies that this
translation function cannot be faithful. This explains why Inoue and Sakama
need an additional stability condition on minimal models to establish
faithfulness. The resulting semantics of positive programs is expressive enough
to capture head-positive programs.

Eiter and Gottlob [4] study the computational complexity of disjunctive logic programs by ranking the main decision problems (brave and cautious reasoning with stable/minimal models) in the polynomial time hierarchy (PH). To summarize their results, these decision problems of positive and head-positive programs are complete problems on the second level of PH. By the tight semantical correspondences embodied in the PFM relationships between the classes of programs, these results extend for the classes allowing default negation in the heads of rules, too.

Corollary 1. For disjunctive programs in $\mathcal{D}$ and for body-positive disjunctive programs, (i) brave reasoning with stable models forms a $\Sigma_2^p$-complete decision problem, and (ii) cautious reasoning with stable models forms a $\Pi_2^p$-complete decision problem.

The results concerning the class $\mathcal{D}$ appeared first in [11, Theorem 6.4]. Further differences in expressive power can be detected if the computational complexity of checking the existence of stable models is taken into consideration. For head-positive disjunctive programs, this forms a $\Sigma_2^p$-complete decision problem. The same holds for $\mathcal{D}$ by the PFM-equivalence of head-positive programs and $\mathcal{D}$ as well as [11, Theorem 6.4]. However, for any positive disjunctive program P it is sufficient to find one (even non-minimal) model to solve this decision problem. This is an indication of the fact that the problem is $\Sigma_1^p$-complete (i.e., NP-complete) [4]. By Lemma 2 and the corresponding PFM relationship, we may conclude that the corresponding decision problem is also $\Sigma_1^p$-complete for the class of body-positive programs.

6  Conclusions  and  Further  Research 

In this paper, we apply a framework based on polynomial, faithful and modular (PFM) translation functions to study the effect of default negation on the expressiveness of disjunctive rules. Three subclasses of the class of disjunctive logic programs $\mathcal{D}$ are identified: the classes of positive programs $\mathcal{D}^+$, head-positive programs, and body-positive programs. To summarize the relationships established by Theorems 1, 4, 5, and 6, we have obtained an expressive power hierarchy (EPH) for disjunctive programs with the following structure: positive programs $<_{\mathrm{PFM}}$ body-positive programs $<_{\mathrm{PFM}}$ head-positive programs $=_{\mathrm{PFM}}$ $\mathcal{D}$. Therefore, we conclude that permitting default negation in the heads of rules does not increase the expressive power of rules given that default negation is allowed in the bodies of rules. The translation function Trh+ given in Definition 2 removes such occurrences of default negation in a straightforward way. However, this is no longer possible when default negation is banned in the bodies of rules so that the expressive power of body-positive disjunctive programs exceeds that of positive disjunctive programs. Moreover, it is clear by the structure of the hierarchy that the expressive power of rules is properly increased by introducing default negation in the bodies of rules.

However, the expressive powers of the four classes are the same if measured by the computational complexity of brave and cautious reasoning with stable models. This follows by the results of Eiter and Gottlob [4], Inoue and Sakama [11], and this paper (Corollary 1). On the other hand, the classes of positive and body-positive programs can be differentiated from the classes of head-positive programs and $\mathcal{D}$ if the complexity of deciding the existence of a stable model is taken into account, but the classes of positive and body-positive programs remain equivalent even under this additional measure. Since positive programs are strictly less expressive than body-positive programs with respect to PFM translations, we conclude that the measure based on PFM translations provides a refined view on the expressiveness of disjunctive rules involving default negation. This is because polynomial transformations involved in PH preserve just the plain yes/no answers of decision problems. Compared to this, the notions of faithfulness and modularity (cf. Definition 1) constitute a much stronger constraint. Let us also point out that the hierarchy EPH deduced in this paper remains valid even in the unlikely event that the complexity classes P and NP coincide and PH collapses.

It is to be expected that the results of this paper can be extended and generalized in several ways. (i) Currently, our results do not cover the classes of extended disjunctive programs where classical literals, i.e., atoms $a$ and their classical negations $\neg a$, may appear wherever atoms appear in ordinary disjunctive rules (1). In order to generalize our framework for the case of extended disjunctive programs, we have to extend languages associated with disjunctive programs and revise the notion of faithfulness accordingly. The basic technique for obtaining translations will be the one by Gelfond and Lifschitz [6]: classical negative literals are simply rewritten as new atoms. (ii) Furthermore, it will be interesting to study the effect of integrity constraints, i.e., rules (1) with $A = \emptyset$, using the framework proposed in this paper. (iii) The current notion of modularity could be split in two, i.e., notions of weak and strong modularity. The latter would correspond to the current notion while the former could be introduced to strengthen intranslatability results. For instance, the proof of Theorem 1 remains valid even if we introduce a notion of modularity requiring that head-positive rules can be translated in separation of rules that are not head-positive. (iv) So far our analysis covers only the stable semantics, but a wide variety of alternative semantics for disjunctive logic programs have been proposed (see, e.g., [2,3,21,27,31]). Due to our recent experiences with non-monotonic logics [14], we expect (in)translatability results regarding other semantics as well. Our first results in this respect on Przymusinski's partial stable models [26] can be found
in [16].

Abstract. Schlipf [Sch95] proved that Stable Logic Programming (SLP) solves all NP decision problems. We extend Schlipf's result to prove that SLP solves all search problems in the class NP. Moreover, we do this in a uniform way as defined in [MT99]. Specifically, we show that there is a single DATALOG$^\neg$ program $P_{Trg}$ such that given any Turing machine $M$, any polynomial $p$ with non-negative integer coefficients and any input $\sigma$ of size $n$ over a fixed alphabet $\Sigma$, there is an extensional database $edb_{M,p,\sigma}$ such that there is a one-to-one correspondence between the stable models of $edb_{M,p,\sigma} \cup P_{Trg}$ and the accepting computations of the machine $M$ that reach the final state in at most $p(n)$ steps. Moreover, $edb_{M,p,\sigma}$ can be computed in polynomial time from $p$, $\sigma$ and the description of $M$, and the decoding of such accepting computations from its corresponding stable model of $edb_{M,p,\sigma} \cup P_{Trg}$ can be computed in linear time. A similar statement holds for Default Logic with respect to $\Sigma_2^P$-search problems.

We also show that there is a single program Meta which is a metainterpreter for SLP programs. That is, for any program $Q$, there is an encoding of $Q$ as an extensional database $edb_Q$ such that the stable models of Meta $\cup$ $edb_Q$ are in one-to-one correspondence with the stable models of $Q$.


1  Introduction 

The main motivation for this paper comes from recent developments in Knowledge Representation, especially the appearance of a new generation of systems [CMT96,NS96,ELM+97] based on the so-called Answer Set Programming (ASP) paradigm [Nie98,CP98,MT99,Lif98]. In particular, these systems suggest that we need to revisit one of the basic issues in the foundations of ASP, namely, how can we characterize what such ASP systems can theoretically compute. Throughout this paper, we shall focus mostly on one particular ASP formalism, namely, the Stable Semantics for Logic Programs (SLP) [GL88]. We note that the underlying methods of ASP are similar to those used in Logic Programming [Ap90] and Constraint Programming [JM94,MS99]. That is, like Logic Programming, ASP is a declarative formalism and the semantics of all ASP systems are based on logic. Like Constraint Programming, certain clauses of an ASP program act as constraints. There is a fundamental difference between ASP programs and Constraint Logic programs, however. That is, in Constraint Programming, the constraints act on individual elements of the Herbrand base of the program while the constraint clauses in ASP programs act more globally in that they place restrictions on what subsets of the Herbrand base can be acceptable answers for the program. For example, suppose that we have a problem $\Pi$ whose solutions are subsets of some Herbrand base $H$. In order to solve the problem, an ASP programmer essentially writes a logic program $P$ that describes the constraints on the subsets of $H$ which can be answers to $\Pi$. The basic idea is that the program $P$ should have the property that there is an easy decoding of solutions of $\Pi$ from stable models of $P$ and that all solutions of $\Pi$ can be obtained from stable models of $P$ through this decoding. The program $P$ is then submitted to an ASP engine such as smodels [NS96], dlv [ELM+97] or DeReS [CMT96] which computes the stable models of the program $P$. Thus the ASP engine finds the stable models of the program (if any exist) and we read off the solutions to $\Pi$ from these stable models. Notice that the idea here is that all solutions are equally good in the sense that any solution found in the process described above is acceptable. Currently, the systems based on the ASP paradigm are being tested on problems related to planning, product configuration, combinatorial optimization and other domains.

It is a well known fact that the semantics of existing Logic Programming systems such as Prolog, Mercury and LDL have serious problems. For instance, the unification algorithm used by most dialects of Prolog does not enforce the occurs check and hence these systems can produce incorrect results [AP94]. Moreover, the processing strategies of Prolog and similar languages have the effect that correct logic programs can be non-terminating [AP93]. While good programming techniques can overcome these problems, it is clear that such deficiencies have restricted the appeal of the Logic Programming systems for ordinary programmers and system analysts. The promise of ASP and, in particular, of SLP and its extensions, such as Disjunctive Logic Programming [GL91,ELM+97], is that a new generation of logic programming systems can be built which have a clear semantics and are easier to program than the previous generation of Logic Programming systems. In particular, both of the problems referred to above, namely, the occurs check problem and the termination problem, do not exist in SLP. Of course, there is a price to pay, namely, SLP systems only accept programs without function symbols. Consequently, one of the basic data structures used in Prolog, specifically, the term, is not available in SLP. Thus SLP systems require the programmer to explicitly construct many data structures. In SLP programming, predicates are used to construct the required data structures and clauses that serve as constraints are used to ensure that the predicates behave properly with respect to the semantics of the program. SLP programs are always terminating because the Herbrand base is finite and hence there are only a finite number of stable models. In addition, unlike the case of usual Logic Programming, the order of the clauses of the program does not affect the set of stable models of the program¹. Finally, the stable semantics of logic programs is well understood so that SLP programs have clear semantics.

We note that the restriction that ASP programs do not allow function symbols is crucial. First, it is well known that once one allows function symbols in a logic program $P$, the Herbrand base becomes infinite. Moreover, the stable models of logic programs with function symbols can be immensely complex. For example, for stratified logic programs [ABW88,Prz88], the perfect model is the unique stable model of that program [GL88]. Apt and Blair [AB90] showed that perfect models of stratified logic programs capture precisely the arithmetic sets. That is, they show that for a given arithmetic set $X$ of natural numbers, there is a finite stratified logic program $P_X$ such that in the perfect model of $P_X$, some predicate $p_X$ is satisfied by precisely the numbers in $X$. This was the first result that showed that it is not possible to have meaningful practical programming with general stratified programs if we allow unlimited use of function symbols. The result of [AB90] was extended in [BMS95] where Blair, Marek, and Schlipf showed that the set of stable models of a locally stratified program can capture any set in the hyperarithmetic hierarchy. Marek, Nerode, and Remmel [MNR94] showed that the problem of finding a stable model of a finite (predicate) logic program $P$ is essentially equivalent to finding a path through an infinite branching recursive tree. That is, given an infinite branching recursive tree $T \subseteq \omega^{<\omega}$, there is a finite program $P_T$ such that there is a one-to-one degree preserving correspondence between the infinite paths through $T$ and the stable models of $P_T$ and, vice versa, given a finite program $P$, there is a recursive tree $T_P$ such that there is a one-to-one degree preserving correspondence between the stable models of $P$ and the infinite paths through $T_P$. One consequence of this result is that the problem of determining whether a finite predicate logic program has a stable model is $\Sigma_1^1$-complete. More results on the structure of the family of stable models of the programs can be found in [CR99].

All the results mentioned in the previous paragraph show that stable semantics for logic programs admitting function symbols can be used only in a very limited setting. This is precisely what the XSB system attempts to do. When the well-founded semantics is total, the resulting model is the unique stable model of the program. XSB attempts to query that model. Unfortunately, the class of programs for which it succeeds is not intuitive [RRS+97]. ASP systems propose a more radical solution to the problem of complexity of stable models of logic programs with function symbols, namely, abandoning function symbols entirely. Once this is accepted, the semantics of a logic program $P$ can be defined in two stages. First, we assume, as in standard Logic Programming, that we interpret $P$ over the Herbrand universe of $P$ determined by the predicates and constants that occur in $P$. Since the set of constants occurring in the program is finite, we can ground the program in these constants to obtain a finite propositional logic program $P_g$. The stable models of $P$ are by definition the stable models of $P_g$. The process of grounding is performed by a separate grounding engine such as lparse [NS96]. The grounded program is then passed to the engine computing the stable models. It is then easy to check that the features of SLP mentioned above, i.e., the absence of occurs check and termination problems and the independence of the semantics from the ordering of the clauses of the program, automatically hold.

¹ However, it is the case that the order of the clauses can affect the processing time of the ASP engine.

The language of logic programming without function symbols was studied by the database community with the hope that it could lead to new, more powerful, database languages [Ull88]. This language is called DATALOG and some database systems such as DB2 implement the positive part of DATALOG. The fact that admitting negation in the bodies of clauses leads to multiple stable models was unacceptable from the database perspective. Hence the database community preferred other semantics for DATALOG programs with negation such as the well-founded semantics [VRS91] or the inflationary semantics [AHV95].

The main purpose of this paper is to revisit the question of what can be computed by logic programs without function symbols under the stable model semantics. First, consider the case of finite propositional programs. Here the situation is simple. Given a set $At$ of propositional atoms, let $\mathcal{F}$ be a finite antichain of subsets of $At$ (i.e., whenever $X, Y \in \mathcal{F}$ and $X \subseteq Y$, then $X = Y$). Then one can show that there is a logic program $P_{\mathcal{F}}$ such that $\mathcal{F}$ is precisely the class of all stable models of $P_{\mathcal{F}}$ [MT93]. Moreover, the family of stable models of any program $P$ forms such an antichain. Thus in the case of finite propositional logic programs, we have a complete characterization of the possible sets of stable models. Note, however, this result does not tell us anything about the uniformity and the effectiveness of the construction. The basic complexity result for SLP propositional programs is due to Marek and Truszczyński [MT91] who showed that the problem of deciding whether a finite propositional logic program has a stable model is NP-complete.

To formulate our question about what can be computed by logic programs without function symbols under the stable model semantics, we first recall the notion of a search problem [GJ79] and of a uniform logic program [MT99]. A search problem is a set $S$ of finite instances [GJ79] such that, given any instance $I \in S$, we have a set $S_I$ of solutions to $S$ for instance $I$. For example, the search problem may be to find Hamiltonian paths in a graph. Thus, the set of instances of the problem is the set of all finite graphs and, for any given instance $I$, $S_I$ is the set of all Hamiltonian paths of $I$. An algorithm solves the search problem $S$ if it returns a solution $s \in S_I$ whenever $S_I$ is non-empty and it returns the string "empty" otherwise. We say that a search problem $S$ is in NP if there is such an algorithm which can be computed by a non-deterministic polynomial time Turing machine. We say that a search problem $S$ is solved by a uniform logic program if there exists:

1. a polynomial time encoding $edb_S$ under which every instance $I$ of $S$ is mapped to a finite set of facts, i.e., clauses with empty bodies and no variables, and

2. a single logic program $P_S$ such that there is a polynomial time computable function $sol_S(\cdot, \cdot)$ such that for every instance $I$ of $S$, $sol_S(I, \cdot)$ maps the set of stable models of $edb_S(I) \cup P_S$ onto the set of solutions $S_I$ of $I$.

We note that decision problems can be viewed as special cases of search problems. Schlipf [Sch95] has shown that the class of decision problems in NP is captured precisely by uniform logic programs. Specifically, he proved that a decision problem is solved by a uniform logic program if and only if it is in NP. An excellent review of the complexity and expressivity results for Logic Programming can be found in [DEGV99].
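To make the notion of a search problem concrete, the following is a minimal Python sketch of the Hamiltonian path example. It is an illustration only and not part of the paper's construction: the function names `is_hamiltonian_path` and `solve`, the representation of a graph as a vertex list plus a set of directed edges, and the brute-force search are all assumptions made here. The check runs in polynomial time, which is what places the problem in NP; the solver follows the convention above of returning a solution or the string "empty".

```python
from itertools import permutations


def is_hamiltonian_path(vertices, edges, path):
    """Polynomial-time check that `path` visits every vertex exactly once
    and that consecutive vertices are joined by a (directed) edge."""
    if len(path) != len(vertices) or set(path) != set(vertices):
        return False
    return all((u, v) in edges for u, v in zip(path, path[1:]))


def solve(vertices, edges):
    """Brute-force solver (exponential time, illustration only):
    return one solution, or the string "empty" if there is none."""
    for candidate in permutations(vertices):
        if is_hamiltonian_path(vertices, edges, candidate):
            return list(candidate)
    return "empty"


if __name__ == "__main__":
    graph_vertices = [1, 2, 3]
    graph_edges = {(1, 2), (2, 3)}
    print(solve(graph_vertices, graph_edges))
```

A nondeterministic machine would guess the candidate path instead of enumerating permutations; only the verification step needs to be polynomial.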

The goal of this paper is to prove a strengthening of Schlipf's result as well as to prove a number of related facts. First, we will prove that Schlipf's result can be extended to all NP search problems. That is, we shall show that there is a single logic program $P_{Trg}$ that is capable of simulating polynomial time nondeterministic Turing machines in the sense that given any polynomial time nondeterministic Turing machine $M$, any input $\sigma$, and any run-time polynomial $p(x)$, there is a set of facts $edb_{M,p,\sigma}$ (depending on $M$, $p(x)$ and $\sigma$) such that a stable model of $P_{Trg} \cup edb_{M,p,\sigma}$ codes an accepting computation of $M$ started with input $\sigma$ that terminates in $p(|\sigma|)$ steps and any such accepting computation of $M$ is coded by some stable model of $P_{Trg} \cup edb_{M,p,\sigma}$. This result will show that logic programs without function symbols under the stable logic semantics capture all NP-search problems². The converse implication, that is, that a search problem computed by a uniform logic program $P$ is an NP-search problem, is obvious since one can compute a stable model of a program by first guessing a candidate set of atoms and then doing a polynomial time check that it is a stable model of the program.

2  Technical  Preliminaries 

In this section we formally introduce several notions that will be needed for the proof of our main result. Our proof of this result uses essentially the same idea used by Cook [Co71] in his proof of the NP-completeness of the satisfiability problem.

First, we introduce the set of logic programs that we will study. We will consider here only so-called DATALOG$^\neg$ programs. Specifically, a clause is an expression of the form

$p(\bar{X}) \leftarrow q_1(\bar{X}), \ldots, q_m(\bar{X}), \neg r_1(\bar{X}), \ldots, \neg r_n(\bar{X})$   (1)

where $p, q_1, \ldots, q_m, r_1, \ldots, r_n$ are atoms, possibly with variables and/or constants. A program is a finite set $P$ of clauses of the form (1). Each program determines its language (based on the predicates occurring in the program). Since there are no function symbols in our programs, both the Herbrand universe and the Herbrand base of the program are finite.

² As pointed out by M. Truszczyński, for our goal of describing the complexity of Stable Logic Programming, a weaker result is sufficient. That is, we need only show that for each instance $I$ of an NP search problem $\Pi$, there is a program $P_I$ and a polynomial time projection from the collection of stable models of $P_I$ to the set of solutions of $I$. Our result shows that this property holds in a stronger form, namely, there is a single program with a varying extensional database.

A ground instance of the clause $C$ of the form (1) is the result of a simultaneous substitution of constants for variables occurring in $C$. Given a program $P$, $P_g$ is the propositional program consisting of all ground instances of clauses of $P$.
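The grounding step can be sketched in a few lines of Python. This is a minimal illustration under assumptions made here, not the algorithm of any particular grounder: an atom is a pair `(predicate, args)`, a clause is a triple `(head, positive_body, negative_body)`, and variables are strings that start with an uppercase letter. The names `is_var`, `substitute` and `ground` are hypothetical.

```python
from itertools import product


def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def clause_vars(clause):
    """Collect the variables of a clause in order of first occurrence."""
    head, pos, neg = clause
    seen = []
    for _, args in [head, *pos, *neg]:
        for term in args:
            if is_var(term) and term not in seen:
                seen.append(term)
    return seen


def substitute(atom, theta):
    """Apply the substitution `theta` (variable -> constant) to one atom."""
    pred, args = atom
    return (pred, tuple(theta.get(term, term) for term in args))


def ground(program, constants):
    """Return all ground instances of all clauses over `constants`."""
    grounded = []
    for clause in program:
        head, pos, neg = clause
        variables = clause_vars(clause)
        for values in product(constants, repeat=len(variables)):
            theta = dict(zip(variables, values))
            grounded.append((
                substitute(head, theta),
                tuple(substitute(a, theta) for a in pos),
                tuple(substitute(a, theta) for a in neg),
            ))
    return grounded


if __name__ == "__main__":
    # p(X) <- q(X, Y), not r(Y)
    rule = (("p", ("X",)), (("q", ("X", "Y")),), (("r", ("Y",)),))
    for ground_clause in ground([rule], ["a", "b"]):
        print(ground_clause)
```

A clause with k distinct variables yields |C|^k ground instances over a set C of constants, so the ground program is finite whenever the set of constants is finite.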

Given a propositional program $P$ and a set $M$ included in its Herbrand base, $B_P$, the Gelfond-Lifschitz transformation of $P$ by means of $M$, $GL(P, M)$, is the program arising from $P$ as follows. First, eliminate all clauses $C$ in $P$ such that for some $j$, $1 \le j \le n$, $r_j \in M$. Then, in any remaining clauses, we eliminate all negated atoms. The resulting set of clauses forms a program, $GL(P, M)$, which is a Horn program and hence it possesses a least model $N_M$. We say that $M$ is a stable model of the propositional program $P$ if $M = N_M$. Finally, we say that $M$ is a stable model of a program $P$ (now possibly with variables), if $M$ is a stable model of the propositional program $P_g$.
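The definition above translates almost directly into code. The following Python sketch is illustrative only: it assumes propositional clauses given as triples `(head, positive_body, negative_body)` with atoms as strings, and the names `gl_reduct`, `least_model`, `is_stable` and `stable_models` are chosen here. The enumeration over all subsets of atoms is exponential and serves only to show the guess-and-check structure; real ASP engines work very differently.

```python
from itertools import chain, combinations


def gl_reduct(program, m):
    """Gelfond-Lifschitz transformation GL(P, M): drop every clause with a
    negated atom in M, then delete the negated atoms of the remaining clauses."""
    return [(head, pos) for head, pos, neg in program if not (set(neg) & m)]


def least_model(horn_program):
    """Least model of a Horn program, computed by naive forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in horn_program:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def is_stable(program, m):
    """M is stable iff it equals the least model N_M of GL(P, M)."""
    m = set(m)
    return least_model(gl_reduct(program, m)) == m


def stable_models(program):
    """Enumerate all stable models by checking every subset of the atoms."""
    atoms = sorted({a for h, pos, neg in program for a in (h, *pos, *neg)})
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1)
    )
    return [set(s) for s in subsets if is_stable(program, s)]


if __name__ == "__main__":
    # p <- not q.   q <- not p.
    program = [("p", (), ("q",)), ("q", (), ("p",))]
    print(stable_models(program))
```

The check `is_stable` runs in polynomial time in the size of the program, which is the polynomial-time verification step behind the NP upper bound.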

A (nondeterministic) Turing Machine is a structure of the form

$M = (Q, \Sigma, \Gamma, D, \delta, s_0, f)$,

where $Q$ is a finite set of states and $\Sigma$ is a finite alphabet of input symbols. We assume $Q$ always contains two special states, $s_0$, the start state, and $f$, the final state. We assume that there is a special symbol $B$ for "blank" such that $B \notin \Sigma$. The set $\Gamma = \Sigma \cup \{B\}$ is the set of tape symbols. The set $D$ of move directions will consist of elements $l$, $r$, and $\lambda$ where $l$ is the "move left" symbol, $r$ is the "move right" symbol and $\lambda$ is the "stay put" symbol. The function $\delta : Q \times \Gamma \to \mathcal{P}(Q \times \Gamma \times D)$ is the transition function of the machine $M$. We can think of $\delta$ as a 5-ary relation. We assume $M$ operates on a one-way infinite tape where the cells of the tape are labeled from left to right by $0, 1, 2, \ldots$. To visualize the behavior of the machine $M$, we shall talk about the read-write head of the machine. At any given time in a computation, the read-write head of $M$ is always in some state $s \in Q$ and is reading some symbol $p \in \Gamma$. It then picks an instruction $(s_1, p_1, d) \in \delta(s, p)$ and then replaces the symbol $p$ by $p_1$, changes its state to state $s_1$, and moves according to $d$.

Suppose we are given a Turing machine $M$ whose runtimes are bounded by a polynomial $p(x) = a_0 + a_1 x + \cdots + a_k x^k$ where each $a_i \in \mathbb{N} = \{0, 1, 2, \ldots\}$ and $a_k \neq 0$. That is, on any input of size $n$, an accepting computation terminates in at most $p(n)$ steps. Then any accepting computation on input $\sigma$ can affect at most the first $p(n)$ cells of the tape. Thus in such a situation, there is no loss in only considering tapes of length $p(n)$. Hence in what follows, we shall implicitly assume that the tape is finite. Moreover, it will be convenient to modify the standard operation of $M$ in the following ways.

1. We shall assume $\delta(f, a) = \{(f, a, \lambda)\}$ for all $a \in \Gamma$.

2. Given an input $x$ of length $n$, instead of immediately halting when we first get to the final state $f$ reading a symbol $a$, we just keep executing the instruction $(f, a, \lambda)$ until we have completed $p(n)$ steps. That is, we remain in state $f$, we never move, and we never change any symbols on the tape after we get to state $f$.

The main effect of these modifications is that all accepting computations will run for exactly $p(n)$ steps on an input of size $n$.

3  Uniform  Coding  of  Turing  Machines  by  a  Logic  Program

In this section, we shall describe the logic program $P_{Trg}$ and our extensional database coding $edb_{M,p,\sigma}$ described above. The key to our construction is the fact that at any given moment of time, the behavior of a Turing machine $M$ depends only on the current state of the tape, the position of the read-write head and the set of available instructions. Our coding of Turing machine computation will reflect this observation.
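The observation that each step depends only on the tape contents, the head position, the current state and the available instructions can be illustrated by a direct simulation. The Python sketch below is not the logic program of this paper; it is a hypothetical enumerator, `accepting_runs`, written here under several assumptions: `delta` is a dictionary from `(state, symbol)` to a set of `(new_state, new_symbol, direction)` triples, the directions are the strings `"l"`, `"r"` and `"stay"` (for $\lambda$), the tape has `steps` cells, and a branch that would move the head off the tape is simply discarded. It applies the padding convention of Section 2, so every accepting run has exactly `steps` instructions.

```python
def accepting_runs(delta, start, final, blank, sigma, steps):
    """Yield every accepting computation of exactly `steps` steps as a
    list of the instructions (new_state, new_symbol, direction) chosen."""
    tape0 = tuple(sigma) + (blank,) * (steps - len(sigma))
    move = {"l": -1, "r": 1, "stay": 0}

    def options(state, symbol):
        # Padding convention: in the final state the only instruction is
        # (final, symbol, stay), so the configuration no longer changes.
        if state == final:
            return {(final, symbol, "stay")}
        return delta.get((state, symbol), set())

    def run(state, pos, tape, t, trace):
        if t == steps:
            if state == final:
                yield trace
            return
        for new_state, new_symbol, direction in sorted(options(state, tape[pos])):
            new_pos = pos + move[direction]
            if not 0 <= new_pos < len(tape):
                continue  # assumption: leaving the finite tape kills the branch
            new_tape = tape[:pos] + (new_symbol,) + tape[pos + 1:]
            yield from run(new_state, new_pos, new_tape, t + 1,
                           trace + [(new_state, new_symbol, direction)])

    yield from run(start, 0, tape0, 0, [])


if __name__ == "__main__":
    # Hypothetical machine: scan right, and on any "1" optionally accept.
    example_delta = {
        ("s0", "0"): {("s0", "0", "r")},
        ("s0", "1"): {("s0", "1", "r"), ("f", "1", "stay")},
    }
    for trace in accepting_runs(example_delta, "s0", "f", "B", "011", 4):
        print(trace)
```

Each yielded trace records one instruction per time step, which is the same information that the predicate describing the selected instruction at each time is meant to carry in the logic program.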

First, we define the language (i.e., a signature) of the program $P_{Trg}$. The set of predicates that will occur in our extensional database are the following: time(X) for "X is a time step", cell(X) for "X is a cell number", symb(X) for "X is a symbol", state(S) for "S is a state", i_position(P) for "P is the initial position of the read-write head", data(P, Q) for "initially, the tape stores the symbol Q at the cell P", delta(X, Y, X1, Y1, Z) for "the triple (X1, Y1, Z) is an executable instruction when the read-write head is in state X and reads symbol Y" (thus delta represents the five-place relation $\delta$), neq(X, Y) for "$X \neq Y$"³, and succ(X, Y) for "$Y = X + 1$".

Next we describe the constants that will be used in our description of time, cell numbers, cell contents and specific machines. The last two families of constants will be "machine-dependent", since we did not specify any restrictions on the finite sets $Q$ and $\Sigma$. Thus we have the following set of constant symbols: (1) $0, 1, \ldots, p(n)$ where $n$ is the length of the input $\sigma$ and $p$ is the runtime polynomial, (2) $s$, for each $s \in Q$. Note two constants $s_0$ (for initial state), and $f$ (for final state) will be present in every extensional database. (3) $x$ for each $x \in \Sigma$, and $B$ (blank symbol), and finally (4) $r$, $l$, $\lambda$.

This given, we can easily define the extensional database edb_{M,p,σ}. That is,
given input σ = σ_1 ... σ_n and runtime polynomial p(x), we let edb_{M,p,σ}
consist of the following set of facts that describe the machine M, the segment
of integers 0, ..., p(n) and the initial configuration of the tape.

1. state(s) ← for s ∈ Q

2. symb(x) ← for x ∈ Σ

3. delta(s, x, s1, x1, d) ← for every pair (s, x) ∈ Q × Σ and every triple
   (s1, x1, d) ∈ δ(s, x)

4. succ(i, i + 1) ← for 0 ≤ i < p(n)

5. time(i) ← for 0 ≤ i ≤ p(n)

6. cell(i) ← for 0 ≤ i ≤ p(n) − 1

7. data(m, σ(m)) ← for 0 ≤ m ≤ |σ| − 1

8. data(m, B) ← for |σ| ≤ m ≤ p(n) − 1

9. dir(l), dir(r), dir(λ)

10. i_position(0)

11. neq(a, b) for all a, b ∈ Q ∪ Σ ∪ {0, ..., p(n)} with a ≠ b.
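
The fact list above is mechanical enough to generate. The following is a minimal Python sketch, not the authors' construction: the function name build_edb, the tuple encoding of facts, and the choice to pass the blank symbol as part of the symbols argument are all assumptions made for illustration.

```python
def build_edb(states, symbols, delta, sigma, p_n, blank="B"):
    """Return the extensional database as a set of ground facts (tuples).

    states  : set of state names (assumed to contain the initial and final state)
    symbols : set of tape symbols (assumed here to already contain the blank)
    delta   : dict mapping (state, symbol) to a set of (state1, symbol1, direction)
    sigma   : the input word, one symbol per position
    p_n     : the value p(n) of the runtime polynomial for this input
    """
    facts = set()
    facts |= {("state", s) for s in states}                        # item 1
    facts |= {("symb", x) for x in symbols}                        # item 2
    for (s, x), triples in delta.items():                          # item 3
        for (s1, x1, d) in triples:
            facts.add(("delta", s, x, s1, x1, d))
    facts |= {("succ", i, i + 1) for i in range(p_n)}              # item 4
    facts |= {("time", i) for i in range(p_n + 1)}                 # item 5
    facts |= {("cell", i) for i in range(p_n)}                     # item 6
    facts |= {("data", m, sigma[m]) for m in range(len(sigma))}    # item 7
    facts |= {("data", m, blank) for m in range(len(sigma), p_n)}  # item 8
    facts |= {("dir", d) for d in ("l", "r", "lambda")}            # item 9
    facts.add(("i_position", 0))                                   # item 10
    universe = list(states) + list(symbols) + list(range(p_n + 1))
    facts |= {("neq", a, b) for a in universe for b in universe if a != b}  # item 11
    return facts
```

The neq facts dominate the size of the result, which grows quadratically in the number of constants; this is consistent with the database being of polynomial size.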


^ Technically, we should use a separate inequality relation for each type, but
we will not use different symbols for these inequality relations.

The remaining predicates of P_Trg are the following: tape(P, Q, T) for “the
tape stores symbol Q at cell P at time T”, position(P, T) for “the read-write
head reads the content of cell P at time T”, state(S, T) for “the read-write
head is in state S at time T” (notice that we have both a unary predicate
state/1 with the content consisting of states, and state/2 to describe the
evolution of the machine), instr(S, Q, S1, Q1, D, T) for “instruction
(S1, Q1, D) belonging to δ(S, Q) has been selected for execution at time T”,
otherInstr(S, Q, S1, Q1, D, T) for “an instruction other than (S1, Q1, D)
belonging to δ(S, Q) has been selected for execution at time T”, instr_def(T)
for “there is an instruction to be executed at time T”, completion for
“computation successfully completed”, and A, a propositional letter.^

In the program P_Trg, there should be no constants. We will not be absolutely
strict in this respect. For ease of presentation, we will use the constants
0, f, and s_0. These can easily be eliminated by introducing appropriate unary
predicates. Also we shall write y = x + 1 for succ(x, y). Finally, to simplify
the clauses, we will follow here the notation used in the smodels syntax. That
is, we will use p(X_1; ...; X_k) as an abbreviation for p(X_1), ..., p(X_k).

This given, we are now ready to write the program P_Trg.

Group 1. Our first four clauses are used to describe the position of the
read-write head at any given time t.

(1.1) position(P, T) ← T = 0, i_position(P)

(1.2) position(P, T1) ← T1 = T + 1, position(P1, T), state(S, T),
      tape(P1, Q, T), instr(S, Q, S1, Q1, D, T), D = l, neq(P1, 0),
      P1 = P + 1

(1.3) position(P, T1) ← T1 = T + 1, position(P1, T), state(S, T),
      tape(P1, Q, T), instr(S, Q, S1, Q1, D, T), D = r, P = P1 + 1

(1.4) position(P, T1) ← T1 = T + 1, position(P, T), state(S, T),
      tape(P1, Q, T), instr(S, Q, S1, Q1, D, T), D = λ

Group 2. Our next three clauses describe how the contents of the tape change
as instructions get executed.

(2.1) tape(P, Q, T) ← T = 0, data(P, Q)

(2.2) tape(P, Q1, T1) ← T1 = T + 1, position(P, T), state(S, T),
      tape(P, Q, T), instr(S, Q, S1, Q1, D, T)

(2.3) tape(P, Q, T1) ← T1 = T + 1, tape(P, Q, T), position(P1, T),
      neq(P, P1)

Group 3. Our next two clauses describe how the state of the read-write head
evolves in time.

(3.1) state(S, T) ← T = 0, S = s_0

(3.2) state(S, T1) ← T1 = T + 1, position(P, T), state(S1, T),
      tape(P, Q, T), instr(S1, Q, S, Q1, D, T)
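
Read procedurally, Groups 1 to 3 say how one selected instruction turns the configuration at time T into the configuration at time T + 1. The following Python sketch illustrates that reading under stated assumptions: the function name step and the representation of a configuration as a (tape, position, state) triple are hypothetical, and bounds on the right end of the tape are not modelled.

```python
def step(config, instruction):
    """Apply one selected instruction (s1, q1, d) to a configuration.

    config is (tape, position, state), where tape maps cell numbers to symbols.
    Returns the successor configuration, or None when the head would have to
    move left from cell 0 (clause (1.2) requires neq(P1, 0)).
    """
    tape, pos, _state = config
    s1, q1, d = instruction
    if d == "l":
        if pos == 0:
            return None          # no clause derives a head position here
        new_pos = pos - 1        # clause (1.2): P1 = P + 1
    elif d == "r":
        new_pos = pos + 1        # clause (1.3): P = P1 + 1
    else:
        new_pos = pos            # clause (1.4): the head stays put
    new_tape = dict(tape)        # clause (2.3): other cells are copied
    new_tape[pos] = q1           # clause (2.2): the scanned cell is rewritten
    return new_tape, new_pos, s1 # clause (3.2): the new state is s1
```

In the logic program the same effect is obtained declaratively: once instr(S, Q, S1, Q1, D, T) holds for exactly one instruction, the facts for time T + 1 are forced.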

^ The propositional letter A will be used whenever we write clauses acting as
constraints. That is, the symbol A will occur in the following syntactical
configuration: A will be the head of some clause, and the negation of A will
also occur in the body of that same clause. In such a situation, a stable model
cannot satisfy all of the remaining literals in the body of that clause.

Group 4. Our next two clauses describe how we select a unique instruction to
be executed at time T.

(4.1) Selecting instruction at step 0.

      instr(S, Q, S1, Q1, D, T) ← state(S; S1), symb(Q; Q1), dir(D),
      time(T), T = 0, S = s_0, i_position(P), tape(P, Q, T),
      delta(S, Q, S1, Q1, D), ¬otherInstr(S, Q, S1, Q1, D, T)

(4.2) Selecting instruction at other steps.

      instr(S, Q, S1, Q1, D, T) ← state(S; S1), symb(Q; Q1),
      dir(D), time(T), position(P, T), state(S, T), tape(P, Q, T),
      delta(S, Q, S1, Q1, D), ¬otherInstr(S, Q, S1, Q1, D, T)

Group 5. Our next set of clauses defines the otherInstr predicate; (5.6) and
(5.7) ensure that exactly one instruction is selected for execution at
any given time T.

(5.1) otherInstr(S, Q, S1, Q1, D1, T) ← state(S; S'; S1; S2),
      symb(Q; Q'; Q1; Q2), time(T), dir(D; D2),
      instr(S', Q', S2, Q2, D2, T), neq(S2, S1)

(5.2) otherInstr(S, Q, S1, Q1, D1, T) ← state(S; S'; S1; S2),
      symb(Q; Q'; Q1; Q2), time(T), dir(D; D2),
      instr(S', Q', S2, Q2, D2, T), neq(Q2, Q1)

(5.3) otherInstr(S, Q, S1, Q1, D1, T) ← state(S; S'; S1; S2),
      symb(Q; Q'; Q1; Q2), time(T), dir(D; D2),
      instr(S', Q', S2, Q2, D2, T), neq(D2, D1)

(5.4) otherInstr(S, Q, S1, Q1, D1, T) ← state(S; S'; S1; S2),
      symb(Q; Q'; Q1; Q2), time(T), dir(D; D2),
      instr(S', Q', S2, Q2, D2, T), neq(S', S)

(5.5) otherInstr(S, Q, S1, Q1, D1, T) ← state(S; S'; S1; S2),
      symb(Q; Q'; Q1; Q2), time(T), dir(D; D2),
      instr(S', Q', S2, Q2, D2, T), neq(Q', Q)

(5.6) The definition of the instr_def predicate.

      instr_def(T) ← state(S; S1), symb(Q; Q1), dir(D), time(T),
      instr(S, Q, S1, Q1, D, T)

(5.7) The clause to ensure that there is an instruction to be executed
      at any given time.

      A ← time(T), ¬instr_def(T), ¬A

Group 6. Constraints for the coherence of the computation process.

(6.1) When the task is completed.

      completion ← instr(f, Q, f, Q, λ, p(n))

(6.2) The atom completion belongs to every stable model.

      A ← ¬completion, ¬A
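
The effect of a clause such as (6.2) can be checked on a tiny propositional program. The sketch below is a brute-force stable model enumerator written for illustration only; the function names, the triple encoding of clauses, and the two-atom choice program (completion versus a made-up atom fail) are assumptions, not part of P_Trg.

```python
from itertools import chain, combinations

def least_model(definite_rules):
    """Least model of a negation-free program given as (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def stable_models(rules, atoms):
    """Enumerate stable models; rules are (head, positive_body, negative_body)."""
    atoms = sorted(atoms)
    candidates = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    found = []
    for candidate in candidates:
        m = set(candidate)
        # Gelfond-Lifschitz reduct: drop clauses blocked by m, then drop negation.
        reduct = [(h, pos) for h, pos, neg in rules if not (set(neg) & m)]
        if least_model(reduct) == m:
            found.append(m)
    return found

# A hypothetical two-way choice between "completion" and "fail".
choice = [("completion", [], ["fail"]), ("fail", [], ["completion"])]
# The constraint idiom of clause (6.2): A <- not completion, not A.
constraint = [("A", [], ["completion", "A"])]
```

Without the constraint the choice program has two stable models; with it, only the model containing completion survives, which is the behaviour the footnote on the letter A describes.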

4  Main  Results 

Our  first  proposition  immediately  follows  from  our  construction. 

Proposition 1. There is a polynomial q so that for every machine M, polynomial
p, and an input σ, the size of the extensional database edb_{M,p,σ} is equal to

In the full version of the paper, we shall prove that for any nondeterministic
Turing machine M, runtime polynomial p(x), and input σ of length n, the stable
models of edb_{M,p,σ} ∪ P_Trg encode the sequences of tapes of length p(n) which
occur in the steps of an accepting computation of M starting on σ, and that any
such sequence of steps can be used to produce a stable model of
edb_{M,p,σ} ∪ P_Trg.

Theorem 1. The mapping of Turing machines to DATALOG¬ programs defined by
M ↦ edb_{M,p,σ} ∪ P_Trg has the property that there is a 1-1 polynomial
time correspondence between the set of stable models of edb_{M,p,σ} ∪ P_Trg and
the set of computations of M of the length p(n) ending in the state f.

Corollary 1. A search problem S can be solved by means of a uniform logic
program in SLP if and only if S is an NP-search problem.

One can also show that all supported models of edb_{M,p,σ} ∪ P_Trg are stable.
This fact implies that a similar corollary holds for Supported Logic
Programming, SuLP.

Corollary 2. A search problem S can be solved by means of a uniform logic
program in SuLP if and only if S is an NP-search problem.

Finally, we can prove similar results for default logic programs without
function symbols with respect to nondeterministic Turing machines with an oracle
for 3-SAT. It thus follows that a search problem S can be solved by means of
a uniform default logic program if and only if S is in . A decision version of
this result has been proved in [CEG97].

Theorem 2. For each n ∈ N there is a default theory (W_n, D_n) such that for
every 3-SAT oracle Turing machine M, every polynomial p ∈ and every
finite input σ where |σ| = n, there is a polynomial-time one-to-one
correspondence between the accepting computations of length p(n) of M on input
σ and the Reiter extensions of the default theory (edb_{M,p,σ} ∪ W_n, D_n).

5  Metainterpreters 

The results of Section 4 suggest that there should be a universal logic
program P_Meta for the stable model semantics in the sense that for any logic
program Q, there exists an extensional database edb_Q describing Q such that
there is a one-to-one correspondence between the stable models of
P_Meta ∪ edb_Q and the stable models of Q. We call such a program a
metainterpreter for SLP programs.

First, we will describe a metainterpreter for the class of so-called 0-2
programs. A propositional program P is a 0-2 program if every clause C of P
has either no positive literal in the body, or exactly 2 positive literals in
the body. Blair proved that 0-2 programs semi-represent all propositional
programs (see [MT93], Ch. 5, for the discussion of semirepresentability). The
following result is due to Blair.

Proposition 2 (Blair). There is a linear-time computable function f that
assigns to each propositional program P a 0-2 program f(P) such that there is a
one-to-one projection from the family of stable models of f(P) to the family of
stable models of P.

We will describe a metainterpreter (which we will call Meta1) that computes
stable models of 0-2 propositional programs. To this end we need a data
structure that expresses the given 0-2 program. The extensional predicates
describing the input program are as follows: atom(·), to describe atoms,
clause(·), to describe clauses, head(·, ·), to describe the head of a clause,
neg(·, ·), to state that an atom occurs negatively in the body of a clause,
first(·, ·), to state that an atom is the first of two positive atoms occurring
in the body of a clause, and second(·, ·), to state that an atom is the second
of two positive atoms occurring in the body of a clause.

The description of a propositional program Q (the data for the program
Meta1) consists of the following facts: atom(a) ← for all atoms a occurring
in Q, clause(c) ← for all clauses c of Q, head(a, c) ← whenever a is the head
of clause c in Q, neg(a, c) ← whenever a occurs negatively in the body of
clause c in Q, and first(a, c) ← and second(b, c) ← whenever a and b are the
first and the second atoms in the body of clause c in Q, respectively. We call
this collection edb_Q.

The remaining predicates of Meta1 are the following: nempty(·), to describe
that there are atoms occurring positively in the body of a clause, empty(·), to
describe that there are no atoms occurring positively in the body of a clause,
in(·), to describe the stable model of the input program itself, out(·), to
describe the complement of the stable model of the input program,
unusable(·), to describe the clauses not involved in the computation of the
stable model, usable(·), to describe the clauses involved in the computation of
the stable model, computed(·), to describe the computed atoms, and A, a
propositional atom.

This given, Meta1 consists of the following clauses.

1. Generating the model.

   (a) in(B) ← atom(B), ¬out(B)

   (b) out(B) ← atom(B), ¬in(B)

2. Computing Gelfond-Lifschitz reduct.

   (a) unusable(C) ← clause(C), atom(B), neg(B, C), in(B)

   (b) usable(C) ← clause(C), ¬unusable(C)

3. Classifying clauses.

   (a) nempty(C) ← clause(C), atom(B), first(B, C)

   (b) empty(C) ← clause(C), ¬nempty(C)

4. Computation process.

   (a) computed(B) ← clause(C), empty(C), usable(C), head(B, C)

   (b) computed(B) ← clause(C), first(B1, C), second(B2, C),
       computed(B1), computed(B2), head(B, C)

5. Constraints.

   (a) A ← atom(B), in(B), ¬computed(B), ¬A

   (b) A ← atom(B), out(B), computed(B), ¬A
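
The five groups of clauses describe a guess-and-check procedure: guess the in/out partition, discard the clauses blocked by the guess, recompute the atoms bottom-up, and reject the guess unless it agrees with what was computed. The Python sketch below follows that reading. It is an illustration rather than the program Meta1 itself: the function name and the dictionary encoding of edb_Q are hypothetical, and the sketch requires a clause to be usable in the two-positive-atom case as well, which is an assumption of this sketch made so that clauses mixing positive and negative body atoms are handled.

```python
from itertools import combinations

def meta1_stable_models(atoms, clauses):
    """Enumerate stable models of a 0-2 program.

    clauses maps a clause name to (head, positives, negatives), where
    positives has length 0 or 2 (the 'first' and 'second' atoms).
    """
    atoms = sorted(atoms)
    models = []
    for k in range(len(atoms) + 1):
        for guess in combinations(atoms, k):
            in_set = set(guess)                        # group 1: in/out guess
            usable = [c for c, (_, _, neg) in clauses.items()
                      if not (set(neg) & in_set)]      # group 2: reduct
            computed, changed = set(), True
            while changed:                             # group 4: computation
                changed = False
                for c in usable:
                    head, pos, _ = clauses[c]
                    if head not in computed and set(pos) <= computed:
                        computed.add(head)
                        changed = True
            if computed == in_set:                     # group 5: constraints
                models.append(in_set)
    return models

# A small hypothetical 0-2 program:
#   c1: p <- not q    c2: q <- not p    c3: s <-    c4: r <- p, s
example_clauses = {
    "c1": ("p", [], ["q"]),
    "c2": ("q", [], ["p"]),
    "c3": ("s", [], []),
    "c4": ("r", ["p", "s"], []),
}
```

Group 3 (empty versus nempty) has no separate counterpart here because an empty list of positive atoms is trivially contained in the computed set.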

We then can prove the following.

Proposition 3. There is a one-to-one projection that for every propositional
program Q maps stable models of Meta1 ∪ edb_Q to stable models of Q.

In the full version of this paper, we construct yet another metainterpreter,
Meta2, for SLP, that accepts all propositional programs (not only 0-2 programs)
and has the property that its supported models are automatically stable. The
size of the representation of the extensional database is, however, larger. A
number of metainterpreters for various classes of programs have been constructed
in [EFLP01].


Abstract. We investigate in this paper the relationship between an
ambiguity propagating defeasible logic recently proposed by Antoniou et
al. [3] and well-founded semantics with priorities [6] under a
straightforward translation from defeasible theories to extended logic
programs. It turns out that a slightly restricted version of defeasible logic
is correct wrt well-founded semantics yet incomplete. We also investigate the
sources of the incompleteness and argue that the additional conclusions
obtained by prioritized well-founded semantics are indeed desired.


1  Introduction 

Defeasible  Logic  was  originally  proposed  by  Donald  Nute  in  1987  [13]  (for  an 
overview  see  the  handbook  article  [14]).  The  logic  was  never  as  prominent  as, 
say,  default  logic  [17]  or  circumscription  [11].  Yet  it  has  received  considerable 
attention  in  recent  years.  This  is  at  least  partly  due  to  a  very  active  group  of 
researchers  at  Griffith  University  which  has  worked  on  theoretical  foundations, 
further  development  and  implementations  of  defeasible  logic(s)  [1,2,3,12]. 

The main advantage of defeasible logic is certainly computational: the
computation of conclusions is polynomial and highly efficient implementations
exist [12]. A second advantage is its built-in preference handling facilities.

In  the  meantime  several  variants  of  Defeasible  Logic  have  been  proposed. 
All of them are defined proof theoretically. Defeasible logic(s) belong to a class
of  nonmonotonic  approaches  which  can  be  called  directly  sceptical.  By  this  we 
mean  sceptical  approaches  where  the  conclusions,  rather  than  being  defined  as 
the  intersection  of  extensions  or  answer  sets,  are  constructed  directly. 

In the area of logic programming well-founded semantics can be viewed as
a directly sceptical semantics. It is interesting to see, therefore, what the
exact relationship between these two approaches is. Since the preference
handling techniques of defeasible logic have no counterpart in standard
well-founded semantics the comparison will be based on an extension of
well-founded semantics which was recently proposed by the author of this paper.^

^ Although numerous prioritized versions of logic programs under stable model or
answer set semantics exist (see [7] for a discussion of some of these
approaches) not much work has been done on prioritizing well-founded semantics.

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  121-132,  2001. 

(c) Springer-Verlag Berlin Heidelberg 2001

To be precise, we will compare one of the arguably most interesting defeasible
logics,  an  ambiguity  propagating  variant  presented  in  [3],  with  the  prioritized 
version  of  well-founded  semantics  for  extended  logic  programs.  This  semantics 
was  originally  proposed  in  [6].  In  this  paper  we  use  a  considerably  simplified 
version.  The  simplification  is  possible  since  for  the  purposes  of  this  paper  we  do 
not  need  the  ability  to  represent  preferences  in  the  logical  language. 

The  major  result  of  this  paper  is  the  correctness  of  the  considered  defeasible 
logic  under  the  condition  that  no  defeasible  rule  is  preferred  to  a  strict  rule. 
We  also  investigate  reasons  for  the  incompleteness  of  the  logic  and  argue  that 
the  additional  conclusions  obtained  by  well-founded  semantics  are  desirable.  The 
paper  may  thus  be  read  as  a  critique  of  defeasible  logic. 

The rest of the paper is organized as follows. Sect. 2 describes the ambiguity
propagating defeasible logic used for comparison in this paper. Sect. 3 presents
a simplified version of the preferred well-founded semantics in [6]. Sect. 4
introduces the translation from defeasible logic to extended logic programs.
Sect. 5 establishes the correctness result and Sect. 6 incompleteness. Sect. 7
concludes.

The  analysis  in  the  paper  is  performed  in  a  propositional  setting,  that  is, 
we  consider  propositional  defeasible  theories  and  propositional  extended  logic 
programs. 

2  Defeasible  Logic 

Defeasible logic was first introduced by Nute [13]. It is based on strict rules
of the form A → p and defeasible rules of the form A ⇒ p. In both cases A is a
set of literals and p a literal. We omit set brackets whenever A is a singleton
set. Facts are represented as strict rules with empty set of antecedents (in
which case the arrow is left out). Nute also introduced a third type of rules
called defeaters which can block the derivation of a literal without giving
rise to the derivation of the complementary literal. In [2] it is shown that
defeaters are not essential in the sense that they can be simulated by the
other rules. We will therefore not discuss defeaters in this paper.

To solve conflicts among rules Nute used a preference relation > among rules:
r > r' intuitively stands for: r has higher priority than r'. The preference
relation is required to be acyclic, i.e. its transitive closure must be
irreflexive.

Nute’s original defeasible logic was not ambiguity propagating. Consider the
following example:

Example 1:

1) ⇒ p

2) ⇒ ¬p

3) ⇒ q

4) p ⇒ ¬q

Since p is not defeasibly provable (because of the conflicting second rule)
rule 4) is disregarded and q is defeasibly provable in Nute’s logic. This seems
highly questionable since, although p is not accepted, there is an argument
supporting ¬q which should not be disregarded in a sceptical approach.

For this reason Antoniou and colleagues [3] defined an “ambiguity propagating
defeasible logic without team defeat” which behaves as desired in the
example. We consider this logic as one of the most interesting variants of
defeasible logic and use it for our comparison in this paper.

A defeasible theory is a pair T = (R, >) where R is a finite set of strict and
defeasible rules and > is the preference relation among R. A conclusion of T
is a tagged literal: +Δq means q is strictly provable, +∂q means q is defeasibly
provable, and +σq means q is supported.^ The tags preceded by minus-signs
stand for corresponding negated expressions. A proof is a finite sequence P of
tagged literals. P(i) denotes the i-th element in the sequence, P(1..i) its
initial segment of length i. The complement of a literal q is denoted ~q. R[q]
is the set of rules with head q, R_s[q] the subset of R[q] consisting of all
strict rules. The antecedents of a rule r are denoted A(r).

Inference rules are phrased as conditions on proofs as follows:^

+Δ : If P(i + 1) = +Δq then
       ∃r ∈ R_s[q] ∀a ∈ A(r) : +Δa ∈ P(1..i).

−Δ : If P(i + 1) = −Δq then
       ∀r ∈ R_s[q] ∃a ∈ A(r) : −Δa ∈ P(1..i).

+∂ : If P(i + 1) = +∂q then
       (1) +Δq ∈ P(1..i) or
       (2.1) −Δ~q ∈ P(1..i) and
       (2.2) ∃r ∈ R[q] such that
               ∀a ∈ A(r) : +∂a ∈ P(1..i) and
               ∀s ∈ R[~q]:
                 ∃a ∈ A(s) : −σa ∈ P(1..i) or
                 r > s.

−∂ : If P(i + 1) = −∂q then
       (1) −Δq ∈ P(1..i) and
       (2.1) +Δ~q ∈ P(1..i) or
       (2.2) ∀r ∈ R[q]:
               ∃a ∈ A(r) : −∂a ∈ P(1..i) or
               ∃s ∈ R[~q] such that
                 ∀a ∈ A(s) : +σa ∈ P(1..i) and
                 r ≯ s.

+σ : If P(i + 1) = +σq then
       +Δq ∈ P(1..i) or
       ∃r ∈ R[q] such that
         ∀a ∈ A(r) : +σa ∈ P(1..i) and
         ∀s ∈ R[~q]:
           ∃a ∈ A(s) : −∂a ∈ P(1..i) or
           s ≯ r.

−σ : If P(i + 1) = −σq then
       −Δq ∈ P(1..i) and
       ∀r ∈ R[q]:
         ∃a ∈ A(r) : −σa ∈ P(1..i) or
         ∃s ∈ R[~q] such that
           ∀a ∈ A(s) : +∂a ∈ P(1..i) and
           s > r.

^ We use σ here rather than the less readable and less mnemonic symbol used in
[3].
^ The rule for +σ in [3] had a mistake in the last line (G. Antoniou, personal
communication) which is corrected here.

Consider the following example:

Example 2:

1) p

2) p → q

3) q ⇒ r

4) ⇒ ¬r

Assume there are no priorities. Here is a proof for +σr:

+Δp, +Δq, +σq, +σr.

If  we  add  the  preference  4  >  3  the  last  step  in  the  proof  does  not  go  through. 
Indeed,  we  now  have  the  following  proof  for 


—Ar^  +(J-ir. 
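
Because the six inference rules are stated as conditions on a proof sequence, they can be transcribed almost literally into a proof checker. The Python sketch below is such a transcription under stated assumptions: the tag names ("+Delta", "+partial", "+sigma" and their negative counterparts), the "~" prefix for complements, and the dictionary encoding of rules are all choices made here for illustration. In the example data, rule 4 is read as a defeasible rule with empty body and head ¬r, which is also an assumption of this sketch.

```python
def neg(lit):
    """Complement of a literal; "~r" encodes the complement of "r"."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def check_proof(rules, pref, proof):
    """Return True iff every step of proof meets its inference condition.

    rules: dict id -> (kind, antecedents, head), kind "strict" or "defeasible"
    pref:  set of pairs (r, s) meaning r > s
    proof: list of (tag, literal) pairs
    """
    def R(q):
        return [r for r, (_, _, h) in rules.items() if h == q]

    def Rs(q):
        return [r for r in R(q) if rules[r][0] == "strict"]

    def A(r):
        return rules[r][1]

    for i, (tag, q) in enumerate(proof):
        past = set(proof[:i])                      # the initial segment P(1..i)

        def has(t, a):
            return (t, a) in past

        if tag == "+Delta":
            ok = any(all(has("+Delta", a) for a in A(r)) for r in Rs(q))
        elif tag == "-Delta":
            ok = all(any(has("-Delta", a) for a in A(r)) for r in Rs(q))
        elif tag == "+partial":
            ok = has("+Delta", q) or (
                has("-Delta", neg(q)) and any(
                    all(has("+partial", a) for a in A(r)) and
                    all(any(has("-sigma", a) for a in A(s)) or (r, s) in pref
                        for s in R(neg(q)))
                    for r in R(q)))
        elif tag == "-partial":
            ok = has("-Delta", q) and (
                has("+Delta", neg(q)) or all(
                    any(has("-partial", a) for a in A(r)) or
                    any(all(has("+sigma", a) for a in A(s)) and (r, s) not in pref
                        for s in R(neg(q)))
                    for r in R(q)))
        elif tag == "+sigma":
            ok = has("+Delta", q) or any(
                all(has("+sigma", a) for a in A(r)) and
                all(any(has("-partial", a) for a in A(s)) or (s, r) not in pref
                    for s in R(neg(q)))
                for r in R(q))
        elif tag == "-sigma":
            ok = has("-Delta", q) and all(
                any(has("-sigma", a) for a in A(r)) or
                any(all(has("+partial", a) for a in A(s)) and (s, r) in pref
                    for s in R(neg(q)))
                for r in R(q))
        else:
            ok = False
        if not ok:
            return False
    return True

# The theory of Example 2 in this encoding.
example_rules = {
    1: ("strict", [], "p"),
    2: ("strict", ["p"], "q"),
    3: ("defeasible", ["q"], "r"),
    4: ("defeasible", [], "~r"),
}
example_proof = [("+Delta", "p"), ("+Delta", "q"), ("+sigma", "q"), ("+sigma", "r")]
```

With an empty preference relation, check_proof(example_rules, set(), example_proof) accepts the sequence; with the preference {(4, 3)} the last step is rejected, in line with the remark that the final step no longer goes through.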


3  Prioritized  Well-Founded  Semantics 

In this section we present a simplified version of the well-founded semantics
for extended logic programs with priorities which was defined in [6]. The
simplification is possible because we are not interested here in expressing
preference information in the logical language (for a discussion why this may
be useful see [6]).

A (propositional) extended logic program is a finite set of rules of the form

c ← a_1, ..., a_n, not b_1, ..., not b_m

where the a_i, b_j and c are propositional literals, i.e., either propositional
atoms or such atoms preceded by the classical negation sign. The symbol not
denotes negation by failure (default negation), ¬ denotes strong negation. An
extended logic program is a finite set P of rules. A prioritized logic program
is a pair (P, >) where P is an extended logic program and > an acyclic
preference relation on P: as in defeasible logic r > r' stands for r is
preferred over r'. In [6] > was required to be transitive. This restriction is
not necessary and is dropped here for the purpose of comparison. Note also that
in the earlier paper the “smaller” rules were preferred rather than the
“bigger” rules as in this paper.

Well-founded semantics is an inherently sceptical semantics that refrains from
drawing conclusions whenever there is a potential conflict. Its original
formulation for general logic programs by Van Gelder, Ross and Schlipf [9] is
based on a certain partial model. Przymusinski reconstructed this definition in
3-valued logic [15,16]. A reformulation based on the least fixed point of a
monotone operator, namely the twofold application of the Gelfond/Lifschitz
γ-operator [8], was first given by Baral and Subrahmanian [5]. The
straightforward extension of this formulation to extended logic programs that
underlies our approach was used by several authors, e.g. [4,10].

Let us first introduce the γ-operator. We say a rule r of the form above is
defeated by a literal l if l = b_i for some i ∈ {1, ..., m}. We say r is
defeated by a set of literals X if X contains a literal that defeats r.

Let P be a logic program, X a set of literals. The X-reduct of P, denoted P^X,
is the program obtained from P by deleting each rule defeated by X. For a set
of rules R, the closure Cl(R) is the smallest set of literals closed under R and
the consequences Cn(R) the smallest set of literals that is (1) closed under R,
and (2) logically closed, i.e., either consistent or equal to the set of all
literals. For the computation of the closure we simply neglect default negated
literals.

The Gelfond/Lifschitz operator γ_P now is defined as follows:

γ_P(X) = Cn(P^X)

For normal logic programs (that is, programs without strong negation) the
atoms true according to well-founded semantics are just the least fixed point
of the twofold application of γ. It was argued in [6] that for the extension of
well-founded semantics to extended logic programs with two kinds of negation it
is favourable to slightly modify the fixed point operator: rather than computing
the least fixed point of γ², Brewka proposed to compute the least fixed point of
γγ*, where γ*, rather than yielding the consequences Cn(P^X), yields the closure
Cl(P^X). This leads to a larger set of well-founded conclusions without
violating correctness wrt answer set semantics.

The intuition behind well-founded semantics can be described as follows:
given a set of literals S already known to be derivable, γ*(S) produces a set of
potential conclusions which still might defeat rules in P. The conclusions of
rules not defeated by any of the potential defeaters are clearly derivable.
Starting with the empty set, we thus generate larger and larger sets S until a
fixed point is reached. The following terminology reflects this intuition:

Definition 1. Let P be an extended logic program.

- A literal l is an S-potential defeater iff l is in the closure of the rules
  in P not defeated by S.

- A rule r is S-undefeatable iff r is not defeated by any S-potential defeater.

- A literal l is S-derivable iff l is a consequence of S-undefeatable rules
  in P.

It is obvious that l is S-derivable iff l ∈ γ(γ*(S)). The least fixed point of
γγ* is called WFS(P), or simply WFS if P is clear from context.
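
The iteration behind WFS(P) is short enough to write out. The following Python sketch computes the least fixed point of γγ* for a program without priorities; the function names, the triple encoding (head, positive body, default-negated body) and the "~" prefix for strong negation are assumptions made for this illustration.

```python
def complement(lit):
    """Strong negation: "~p" encodes the literal written with the negation sign."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def closure(rules):
    """Cl(R): least set of literals closed under R, ignoring default negation."""
    lits, changed = set(), True
    while changed:
        changed = False
        for head, pos, _ in rules:
            if head not in lits and set(pos) <= lits:
                lits.add(head)
                changed = True
    return lits

def reduct(program, x):
    """The X-reduct: delete every rule defeated by x."""
    return [r for r in program if not (set(r[2]) & x)]

def gamma_star(program, x):
    return closure(reduct(program, x))

def gamma(program, x):
    """Cn of the reduct: the closure if consistent, else the set of all literals."""
    lits = gamma_star(program, x)
    if any(complement(l) in lits for l in lits):
        every = {l for h, pos, neg in program for l in [h, *pos, *neg]}
        return every | {complement(l) for l in every}
    return lits

def wfs(program):
    """Least fixed point of the composed operator, starting from the empty set."""
    s = set()
    while True:
        nxt = gamma(program, gamma_star(program, s))
        if nxt == s:
            return s
        s = nxt

# A hypothetical program: a <- not b, b <- not a, c <- not d.
example = [("a", [], ["b"]), ("b", [], ["a"]), ("c", [], ["d"])]
```

For this program the mutually defeating rules for a and b stay undecided, while c is derivable because nothing can ever support d, so wfs(example) is the set containing only c.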

To take preferences into account we first introduce a notion of dominance.
Intuitively, a rule r dominates a rule r' in the context of a set of literals S
if r has higher priority and if the application of r in context S actually
defeats r'. As pointed out in [6] the second condition is necessary to guarantee
that prioritized well-founded semantics is an extension of well-founded
semantics. Here is the formal definition:

Definition 2. Let r and r' be rules, S a set of literals. We say r S-dominates
r' iff

1. r > r', and

2. Cl({r} ∪ {s : s is S-undefeatable}) defeats r'.

For the case of prioritized programs Def. 1 becomes

Definition 3. Let (P, >) be a prioritized logic program.

- A literal l is an S-potential r-defeater iff l is in the closure of rules in P
  which are (1) not defeated by S and (2) not S-dominated by r.

- A rule r is S-safe iff r is not defeated by any S-potential r-defeater.

- A literal l is S-derivable iff l is a consequence of the set of S-safe rules
  in P.

The definition for prioritized logic programs is different from the one for
non-prioritized programs in two respects. Firstly, there is not a single set of
potential defeaters for all rules but each rule r has its own set of potential
defeaters. Secondly, the rules which are used to derive potential defeaters
must satisfy an additional condition: to potentially defeat r a rule must not
be dominated by r in context S. Since fewer rules can be used to derive
potential defeaters for a rule the safe rules are a superset of the
undefeatable rules. We thus obtain more derivable literals. For the special
case where > is empty the two definitions of S-derivable clearly coincide.

The set of S-derivable literals grows monotonically with S. We thus can start
as usual with the empty set of literals and iterate the computation of
S-derivable formulas until a fixed point is reached.
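
Definitions 2 and 3 can be turned into a direct, unoptimized computation. The Python sketch below does so under stated assumptions: rules are stored in a dictionary from hypothetical rule names to (head, positive body, default-negated body) triples, "~c" encodes the strongly negated literal, and pref is a set of pairs (r, r2) meaning r > r2. It is meant as an illustration of the definitions, not as the implementation used in [6].

```python
def complement(lit):
    """Strong negation: "~c" encodes the literal written with the negation sign."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def closure(rules):
    """Cl(R): least set of literals closed under R, ignoring default negation."""
    lits, changed = set(), True
    while changed:
        changed = False
        for head, pos, _ in rules:
            if head not in lits and set(pos) <= lits:
                lits.add(head)
                changed = True
    return lits

def consequences(rules, program):
    """Cn(R): the closure if consistent, otherwise the set of all literals."""
    lits = closure(rules)
    if any(complement(l) in lits for l in lits):
        every = {l for h, pos, neg in program.values() for l in [h, *pos, *neg]}
        return every | {complement(l) for l in every}
    return lits

def defeated(rule, lits):
    return bool(set(rule[2]) & lits)

def derivable(program, pref, s):
    """The S-derivable literals of Definition 3 for S = s."""
    live = {n: r for n, r in program.items() if not defeated(r, s)}
    potential = closure(list(live.values()))          # S-potential defeaters
    undefeatable = [r for r in program.values() if not defeated(r, potential)]

    def dominates(r, r2):                             # Definition 2
        return (r, r2) in pref and defeated(
            program[r2], closure([program[r]] + undefeatable))

    safe = []
    for name, rule in program.items():
        rivals = [r for n, r in live.items() if not dominates(name, n)]
        if not defeated(rule, closure(rivals)):       # no S-potential r-defeater
            safe.append(rule)
    return consequences(safe, program)

def prioritized_wfs(program, pref):
    """Iterate the S-derivable literals from the empty set to a fixed point."""
    s = set()
    while True:
        nxt = derivable(program, pref, s)
        if nxt == s:
            return s
        s = nxt

# A three-rule program: 1) c <- not ~c, a   2) ~c <- not c   3) a
program = {1: ("c", ["a"], ["~c"]), 2: ("~c", [], ["c"]), 3: ("a", [], [])}
```

With the preference 1 > 2, prioritized_wfs(program, {(1, 2)}) yields the literals a and c; with an empty preference relation only a is obtained, since rules 1 and 2 then block each other.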

Here is a small example illustrating the definition.

Example 3:

1) c ← not ¬c, a

2) ¬c ← not c

3) a

Let 1 > 2. Clearly, rule 3 is ∅-safe since there is no way of defeating a rule
without default negation. But 1) is also ∅-safe since the closure of 1 together
with the ∅-undefeatable rule 3 defeats 2 and thus 1 ∅-dominates 2. Therefore the
set of ∅-derivable literals is {c, a}. This set is already the least fixed
point.


4 The Translation

We use a straightforward modular translation Trans from defeasible theories
T = (R, >) to extended logic programs Trans(T) = (Trans(R), >') where
Trans(R) = {Trans(r) : r ∈ R} and the translation of each rule is defined
as follows:

{a_1, ..., a_n} → b becomes b ← a_1, ..., a_n

{a_1, ..., a_n} ⇒ b becomes b ← not ~b, a_1, ..., a_n

Furthermore, we require that Trans(r) >' Trans(r') iff r > r' (in the rest of
the paper we use the same symbol for the two preference relations because we
don’t expect this to cause any confusion).
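
Since the translation is modular, it amounts to a per-rule map. The Python sketch below spells this out; the function names, the rule encoding (kind, antecedents, head) and the "~" prefix for complementary literals are assumptions made for illustration.

```python
def complement(lit):
    """Complement of a literal; "~b" encodes the complement of "b"."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def trans_rule(kind, antecedents, head):
    """Translate one rule into (head, positive_body, default_negated_body)."""
    if kind == "strict":
        return (head, list(antecedents), [])
    if kind == "defeasible":
        # A defeasible rule is blocked by default when the complement is derivable.
        return (head, list(antecedents), [complement(head)])
    raise ValueError("unknown rule kind: " + kind)

def trans(theory_rules, pref):
    """Translate a defeasible theory; rule names and preferences carry over."""
    program = {name: trans_rule(*rule) for name, rule in theory_rules.items()}
    return program, set(pref)

# A hypothetical theory with two defeasible rules and one strict rule.
theory = {
    1: ("defeasible", [], "p"),
    2: ("strict", ["p"], "q"),
    3: ("defeasible", [], "~q"),
}
```

Keeping the rule names unchanged makes the requirement on the two preference relations hold trivially: the translated program simply reuses the pairs of the original relation.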

The prioritized logic programs obtained this way are a proper subset of
prioritized extended logic programs which we call defeasible logic programs.
Defeasible logic programs use default negation in a highly restricted way
(corresponding to normal defaults in default logic).

5 (In)correctness

We  first  investigate  correctness  of  defeasible  logic  wrt  prioritized  well-founded 
semantics,  that  is,  the  question  whether  for  each  defeasible  conclusion  -{-Sp  of  a 
defeasible  theory  T  we  have  p  e  WFS{Trans{T)).  The  answer  for  the  general 
case  will  be  no,  but  for  a  somewhat  restricted  case  correctness  can  be  established. 

The negative answer for the general case can be demonstrated by the following
counterexample (we put the defeasible logic rules and their translation into
the same line):

Example 4:

1) $\Rightarrow p$        $p \leftarrow \mathit{not}\ \neg p$

2) $p \rightarrow q$        $q \leftarrow p$

3) $\Rightarrow \neg q$        $\neg q \leftarrow \mathit{not}\ q$

Assume $3 > 2$. Now $+\delta\neg q$ is a conclusion which can be established through the
following derivation:

$-\Delta p,\ -\Delta q,\ +\delta\neg q$

Well-founded semantics, on the other hand, concludes $q$ but not $\neg q$: although
3 has higher priority than 2 it does not dominate 2 since a strict rule can never
be defeated. From this we have the following proposition:

Proposition 1. Defeasible logic is incorrect wrt prioritized well-founded semantics:
there is a defeasible theory $T = (R, >)$ and a formula $q$ such that $+\delta q$ is a
consequence of $T$ but $q \notin \mathit{WFS}(\mathit{Trans}(T))$.

The example already hints at the source of the incorrectness. In defeasible
logic a strict rule can be overridden by a defeasible rule with higher priority. This
can never happen in well-founded semantics where the conclusion of a strict rule
is accepted whenever its antecedents are accepted no matter what the preferences
are. Indeed, a restriction on the admissible preferences turns out to be sufficient
for obtaining correctness.

Proposition 2. Let $T = (R, >)$ be a defeasible theory such that $>$ is defined on
defeasible rules only, $\mathit{Trans}(T) = (P, >)$ its translation. If $+\delta q$ is a conclusion
of $T$ then $q \in \mathit{WFS}(\mathit{Trans}(T))$.

Proof. For the proof we show the following lemmata: Let $S = S_0 \cup S_1 \cup \ldots$ be
the least fixed point reached for $\mathit{Trans}(T)$, that is, $S_0 = \emptyset$ and for all $i > 0$, $S_i$
is the set of $S_{i-1}$-derivable literals. Then the following results hold:

Lemma 1. If $+\Delta q$ is a conclusion of $T$, then $q \in S_1$ (and hence in all $S_j$
for $j > 1$).

Lemma 2. If $+\delta q$ is a conclusion of $T$, then $q \in S_i$ for some $i$ (and hence
in all $S_j$ for $j > i$).

Lemma 3. If $-\sigma q$ is a conclusion of $T$, then $q$ is not an $S$-potential defeater,
that is, there is an $i$ such that $q$ is not an $S_i$-potential defeater (and hence
not an $S_j$-potential defeater for all $j > i$).

Note  that  our  proposition  is  equivalent  to  Lemma  2.  Lemma  1  is  immediate  since 
strict  rules  can  never  be  defeated. 

The proof for Lemmas 2 and 3 is by joint induction on the length $n$ of the
shortest proof of the corresponding tagged literals. The base case can be checked
easily. For the inductive step we assume that Lemmas 2 and 3 hold for tagged
literals whose shortest proofs are of length at most $n-1$. We have to distinguish
two cases representing the possible tagged literals appearing in the lemmata:

case $+\delta q$:

there are two alternatives

a) $+\Delta q$ appears in the proof before $+\delta q$, then according to Lemma 1, $q$ is already
in $S_1$ and thus in $S$, or

b) there is a rule $r$ with head $q$ whose antecedents are, by induction hypothesis,
in $S_j$ for some $j$, and for all conflicting rules $r'$: an antecedent is, by induction
hypothesis, not an $S_k$-potential defeater, for some $k$, or $r'$ is $S_j$-dominated by $r$.
Hence $r$ is $S_m$-safe for sufficiently large $m$ and thus $q \in S$. Note that for
domination to hold we need the fact that $r'$ is a defeasible rule, otherwise $r$ could not
dominate $r'$.

case $-\sigma q$:

we know that for each rule $r$ with head $q$ one of the following two alternatives
holds:

a) there is an antecedent which is, by induction hypothesis, not an $S_j$-potential
defeater for some $j$, or

b) there is a conflicting rule $s$ whose antecedents are, by induction hypothesis,
already in $S_i$, for some $i$, and which $S_i$-dominates $r$ (since $s > r$ rule $r$ must
be defeasible and thus domination follows from the fact that the rules have
complementary heads).

Let $k$ be the smallest integer such that for each rule $r$ with head $q$ an
antecedent of $r$ is not an $S_k$-potential defeater (case a) or $r$ is $S_k$-dominated by a
rule with complementary head (case b). Let $D$ be the set of rules with head $q$ for
which case b) holds but not case a). There are two possibilities: if $D$ is empty
then $q$ is not an $S_k$-potential defeater. If $D$ is not empty then, since $>$ is acyclic,
there must be a rule $r'$ among the rules $S_k$-dominating elements of $D$ which is
itself not $S_k$-dominated and thus $S_k$-safe. $S_{k+1}$ therefore contains $\neg q$, all rules
for which case b) holds are thus $S_{k+1}$-defeated and $q$ is not an $S_{k+1}$-potential
conclusion.


6  Incompleteness 

We next discuss completeness. It turns out that defeasible logic is incomplete
wrt prioritized well-founded semantics:

Proposition 3. Let $T = (R, >)$ be a defeasible theory, and $\mathit{Trans}(T) = (P, >)$
its translation. $q \in \mathit{WFS}(\mathit{Trans}(T))$ does not imply that $+\delta q$ is a conclusion
of $T$.

To prove the proposition we will discuss some counterexamples which also
illustrate the sources of the incompleteness.

Here  is  a  first  counterexample: 

Example 5:

1) $\Rightarrow \neg p$        $\neg p \leftarrow \mathit{not}\ p$

2) $p \Rightarrow p$        $p \leftarrow p$

Assume there are no preferences. There is no proof for $+\delta\neg p$. Although the
conflicting rule 2 can never be used to derive $p$, the mere existence of the rule is
regarded as sufficient reason not to conclude $\neg p$. Well-founded semantics, on the
other hand, concludes $\neg p$: $p$ is not a potential $\emptyset$-defeater, 1 is thus $\emptyset$-undefeatable
and used to derive $\neg p$. Well-founded semantics thus implicitly performs the kind
of loop checking which is lacking in defeasible logic.

For the next counterexample consider again the rules of Example 4:

Example 6:

1) $\Rightarrow p$        $p \leftarrow \mathit{not}\ \neg p$

2) $p \rightarrow q$        $q \leftarrow p$

3) $\Rightarrow \neg q$        $\neg q \leftarrow \mathit{not}\ q$

This time we assume no priorities. Clearly $q \in \mathit{WFS}(\mathit{Trans}(T))$ but $+\delta q$ is
not a conclusion of $T$. This illustrates a major difference in the way strict rules
are treated. In well-founded semantics a strict rule is applied whenever its
antecedents are accepted, independently of whether the antecedents are derived
using strict or defeasible rules. In defeasible logic strict rules have a different
role, depending on whether all antecedents have strict proofs or not. If this is
the case, then the rule is applied. If one of the antecedents is only defeasibly
derivable then the strict rule is treated like a defeasible rule and may be blocked
by a conflicting defeasible rule, as in our example.

Is this ambivalent role of rules adequate? In other words, is well-founded
semantics sometimes not cautious enough? We do not think so. As the authors
of [3] point out, "strict rules are intended to define relationships that are
definitional in nature". The example they give is $emu(X) \rightarrow bird(X)$. If there is
a definitional relationship between emus and birds it seems fully adequate to
accept, say, $bird(Tweety)$ if $emu(Tweety)$ is accepted. The additional
conclusions obtained by well-founded semantics in examples like the one above seem
perfectly reasonable.

The  next  example  shows  that  well-founded  semantics  takes  more  preferences 
into  account  than  defeasible  logic. 

Example  7: 

1) $\Rightarrow p$        $p \leftarrow \mathit{not}\ \neg p$

2) $p \rightarrow \neg q$        $\neg q \leftarrow p$

3) $\Rightarrow q$        $q \leftarrow \mathit{not}\ \neg q$

4) $q \rightarrow \neg p$        $\neg p \leftarrow q$

Assume $1 > 3$. Defeasible logic does not conclude $+\delta p$. The reason is that only
preferences of rules with complementary heads play a role in the proof theory
of defeasible logic. Preferences among other rules are simply disregarded.
Well-founded semantics, on the other hand, concludes $p$ in the example. Rule 1 is
$\emptyset$-safe since the closure of 1 together with the $\emptyset$-undefeatable rule 2 defeats 3,
that is, 1 $\emptyset$-dominates 3.

Again  we  believe  that  the  additional  conclusions  obtained  by  well-founded 
semantics  are  perfectly  reasonable. 

7  Conclusions 

In  this  paper  we  have  analyzed  the  relationship  between  defeasible  logic  and 
well-founded  semantics  for  prioritized  extended  logic  programs  with  two  types 
of  negation.  For  the  comparison  we  used  a  straightforward  modular  translation 
of  defeasible  theories  to  extended  logic  programs.  The  analysis  was  based  on  the 
arguably  most  attractive  variant  of  defeasible  logic,  the  ambiguity  propagating 
defeasible  logic  presented  in  [3].  The  prioritized  well-founded  semantics  we  used 
is a considerably simplified version of a semantics proposed in [6]. The
simplification was possible since for the purpose of this paper the ability to represent
preference information in the logical language was not essential.

It turned out that, although correctness does not hold in general, a minor
restriction is sufficient to guarantee correctness: if we admit preferences between
defeasible rules only, then all defeasibly provable literals are true in prioritized
well-founded semantics. It should be mentioned that the use of the ambiguity
propagating variant of defeasible logic without team defeat clearly is essential
for this result. Nute's original version is obviously incorrect (see Example 1), as
is any other variant without ambiguity propagation.

We  also  analyzed  the  sources  of  the  incompleteness  of  defeasible  logic.  It 
turned  out  that  three  factors  contribute  to  the  incompleteness:  1)  the  lack  of 
loop  checking,  2)  the  somewhat  ambivalent  role  of  strict  rules  which  -  so  to 
speak  -  turn  into  defeasible  rules  if  not  all  antecedents  are  strictly  provable,  and 
3)  the  preference  handling  which  completely  neglects  preferences  between  rules 
which  do  not  have  complementary  literals. 

From a semantical point of view well-founded semantics seems to have clear
advantages: the additional conclusions obtained seem perfectly reasonable.
Moreover, in comparison with the complex rules of defeasible logic the definition of
well-founded semantics is quite simple and elegant. Finally, the semantics is
defined for a much larger class of programs than those obtained by translating
defeasible theories.

What remains is the computational aspect. In both approaches the
computation of conclusions is polynomial in the size of the rule base. In [6] the variant
of well-founded semantics where preference information is expressed in the
language is reported to be of cubic complexity. In [12] Nute's defeasible logic is
reported to be of linear complexity. It remains an issue for further study how
these results transfer to the variants discussed in this paper, and whether there
are applications where a possible computational advantage of defeasible logic
can outweigh its semantical disadvantages.

Abstract. Much work has been done on extending the well-founded
semantics to general disjunctive logic programs and various approaches
have been proposed. However, no consensus has been reached about
which semantics is the most intended. In this paper we look at
disjunctive well-founded reasoning from different angles. We show that there
is an intuitive form of the well-founded reasoning in disjunctive logic
programming which can be equivalently characterized by several
different approaches including program transformations, argumentation,
unfounded sets (and resolution-like procedure). We also provide a
bottom-up procedure for this semantics. The significance of this work is not
only in clarifying the relationship among different approaches, but also
in providing novel arguments in favor of our semantics.


1  Introduction 

The importance of representing and reasoning about disjunctive information has
been addressed by many researchers. Disjunctive logic programming (DLP) is
widely believed to be a suitable tool for formalizing disjunctive reasoning and
it has received extensive study in recent years. Since DLP admits both default
negation and disjunction, the issue of finding a suitable semantics for disjunctive
programs is more difficult than it is in the case of normal (i.e. non-disjunctive)
logic programs. Usually, skepticism and credulism represent two major semantic
intuitions for knowledge representation in artificial intelligence. The well-founded
semantics [12] is a formalism of skeptical reasoning in normal logic programming
while the stable semantics [6] formalizes credulous reasoning. Recently,
considerable effort has been devoted to generalizing these two semantics to disjunctive logic
programs. However, the task of generalizing the well-founded model to
disjunctive programs has proven to be complex. There have been various proposals for
defining the well-founded semantics for general disjunctive logic programs [8].
As argued by some authors (for instance [2,10,13]), each of the previous versions
of the disjunctive well-founded semantics bears its own drawbacks. Moreover, no
consensus has been reached about what constitutes an intended well-founded
semantics for disjunctive logic programs. The semantics D-WFS [1,2], STATIC [10]
and WFDS [13] are among the most recent approaches to defining disjunctive
well-founded semantics. D-WFS is based on a series of abstract properties and it
is the weakest (least) semantics that is invariant under a set of program
transformations. STATIC has its root in autoepistemic logic and is based on the notion of
static expansions for belief theories. The semantics STATIC($P$) for a disjunctive
program $P$ is defined as the least static expansion of $P_{AEB}$ where $P_{AEB}$ is the
belief theory corresponding to $P$. The basic idea of WFDS is to transform $P$ into
an argumentation framework and WFDS($P$) is specified by the least acceptable
hypothesis of $P$. Although these semantics stem from very different intuitions, all
of them share a number of attractive properties. In particular, each of these
semantics extends both the well-founded semantics [12] for normal logic programs
and the generalized closed world assumption (GCWA) [9] for positive disjunctive
programs (i.e. without default negation).

It has been proven that D-WFS is equivalent to a restricted version of
STATIC [3]. But the relation of these semantics to the argumentation-based
semantics and unfounded sets is as yet unclear. In this paper, we modify some
existing semantics to make them more intuitive and report further equivalence
results. First, we define a transformation-based semantics denoted D-WFS* by
introducing a new transformation into Brass and Dix's set $T_{WFS}$ of program
transformations. This semantics naturally extends D-WFS and enjoys all the
important properties that have been proven for D-WFS. We prove that WFDS
is equivalent to D-WFS*. We also provide a bottom-up evaluation procedure
for WFDS (and D-WFS*). Second, we define a new notion of unfounded sets
which is a generalization of the unfounded sets defined in [7,5]. Based on this
new notion of unfounded sets, we define a well-founded semantics U-WFS for
disjunctive programs. We show that U-WFS is equivalent to WFDS (and thus
D-WFS*). Moreover, in [14] we have developed a top-down procedure D-SLS
Resolution which is sound and complete with respect to our semantics. D-SLS
extends both SLS-resolution [11] and SLI-resolution [8]. Altogether we obtain
the following equivalence results:


WFDS  =  D-WFS*  =  U-WFS  =  D-SLS. 


We consider these results to be quite significant: (1) Our results clarify the
relationship among several different approaches to defining disjunctive
well-founded semantics, including argumentation-based, transformation-based,
unfounded sets-based and resolution-based approaches. (2) Since the four
semantics are based on very different intuitions, these equivalent characterizations
in turn provide yet more powerful arguments in favor of our semantics. (3) Both
the top-down procedure D-SLS Resolution [14] and the bottom-up query
evaluation proposed in this paper pave two different ways for implementing our
semantics.

The rest of this paper is arranged as follows. In Section 2 we recall some
basic definitions and notation; we present in Section 3 a slightly restricted form
of the well-founded semantics WFDS. In Section 4 we introduce a new
program transformation Head reduction and then define the transformation-based
semantics D-WFS*, which naturally extends D-WFS. In Section 5, we first
provide a bottom-up query evaluation for D-WFS* (and WFDS) and then prove
the equivalence of D-WFS* and WFDS. Section 6 introduces the new notion of
unfounded sets and defines the well-founded semantics U-WFS. We also show
that U-WFS is equivalent to WFDS. Section 7 is our conclusion. Proofs of the
theorems are given in the full version of this paper.


2  Preliminaries 

We  briefly  review  most  of  the  basic  notions  used  throughout  this  paper. 

A disjunctive logic program is a finite set of rules of the form

$a_1 \vee \cdots \vee a_n \leftarrow b_1, \ldots, b_m, \mathit{not}\ c_1, \ldots, \mathit{not}\ c_t$    (1)

where $a_i$, $b_i$, $c_i$ are atoms and $n > 0$. The default negation '$\mathit{not}\ a$' of an atom $a$ is
called a negative literal. In this paper we consider only propositional programs
although many definitions and results hold for predicate logic programs.

P  is  a  normal  logic  program  if  it  contains  no  disjunctions. 

If a rule of form (1) contains no negative body literals, it is called positive; $P$
is a positive program if every rule of $P$ is positive.

If a rule of form (1) contains no body atoms, it is called negative; $P$ is a
negative program if every rule of $P$ is negative.

Following [2], we also say a negative rule $r$ is a conditional fact. That is, a
conditional fact is of form $a_1 \vee \cdots \vee a_n \leftarrow \mathit{not}\ c_1, \ldots, \mathit{not}\ c_m$, where $a_k$ and $c_j$
are (ground) atoms for $1 \le k \le n$ and $0 \le j \le m$.

For a rule $r$ of form (1), $body(r) = body^+(r) \cup body^-(r)$ where $body^+(r) =
\{b_1, \ldots, b_m\}$ and $body^-(r) = \{\mathit{not}\ c_1, \ldots, \mathit{not}\ c_t\}$; $head(r) = a_1 \vee \cdots \vee a_n$. When
no confusion is caused, we also use $head(r)$ to denote the set of atoms in $head(r)$.
For instance, $a \in head(r)$ means that $a$ appears in the head of $r$. If $X$ is a set
of atoms, $head(r) - X$ is the disjunction obtained from $head(r)$ by deleting the
atoms in $X$. The set $head(P)$ consists of all atoms appearing in rule heads of $P$.

As usual, $B_P$ is the Herbrand base of disjunctive logic program $P$, that is, the
set of all (ground) atoms in $P$. A positive (negative) disjunction is a disjunction
of atoms (negative literals) in $P$. A pure disjunction is either a positive one or
a negative one. The disjunctive base of $P$ is $DB_P = DB_P^+ \cup DB_P^-$ where $DB_P^+$
is the set of all positive disjunctions in $P$ and $DB_P^-$ is the set of all negative
disjunctions in $P$. If $A$ and $B = A \vee A'$ are two disjunctions, then we say $A$ is a
sub-disjunction of $B$, denoted $A \subseteq B$.

A model state of a disjunctive program $P$ is a subset of $DB_P$. Usually, a
well-founded semantics for a disjunctive logic program is defined by a model
state.

If  S  is  an  expression  (a  set  of  literals,  a  disjunction  or  a  set  of  disjunctions), 
atoms(S)  denotes  the  set  of  all  atoms  appearing  in  S. 

For simplicity, we assume that all model states are closed under implication
of pure disjunctions. That is, for any model state $S$, if $A$ is a sub-disjunction of
a pure disjunction $B$ and $A \in S$, then $B \in S$. For instance, if $S = \{a, b \vee c\}$,
then $a \vee b \vee c \in S$.

Given a model state $S$ and a pure disjunction $A$, we also say $A$ is satisfied
by $S$, denoted $S \models A$, if $A \in S$.

We assume that all disjunctions have been simplified by deleting the repeated
literals. For example, the disjunction $a \vee b \vee b$ is actually the disjunction $a \vee b$.
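
These conventions can be made concrete with a small Python sketch. The encoding of a pure disjunction as a pair (polarity flag, set of atoms) and the helper names `pos`, `neg` and `satisfies` are assumptions of this illustration, not notation from the paper. Storing the atoms in a set removes repeated literals automatically, and closure under implication is checked on demand rather than materialized.

```python
from typing import FrozenSet, Iterable, Tuple

# A pure disjunction: (is_negative, atoms). A frozenset of atoms drops
# repeated literals, so "a v b v b" and "a v b" are the same value.
Disjunction = Tuple[bool, FrozenSet[str]]


def pos(*atoms: str) -> Disjunction:
    """Positive disjunction of the given atoms."""
    return (False, frozenset(atoms))


def neg(*atoms: str) -> Disjunction:
    """Negative disjunction 'not a1 v ... v not am' over the given atoms."""
    return (True, frozenset(atoms))


def satisfies(state: Iterable[Disjunction], d: Disjunction) -> bool:
    """True if d lies in the closure of `state` under implication of pure
    disjunctions: some member with the same polarity is a sub-disjunction of d."""
    is_negative, atoms = d
    return any(n == is_negative and a <= atoms for n, a in state)


S = {pos("a"), pos("b", "c")}
print(satisfies(S, pos("a", "b", "c")))     # True
print(satisfies(S, pos("b")))               # False
print(pos("a", "b", "b") == pos("a", "b"))  # True
```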

3  Argumentation  and  Well-Founded  Semantics 

As illustrated in [13]¹, argumentation provides a unifying semantic framework
for DLP. The basic idea of the argumentation-based approach for DLP is to
translate each disjunctive logic program into an argument framework $F_P =
\langle P, DB_P^-, \leadsto_P \rangle$. In the framework defined in [13], an assumption of $P$ is a
negative disjunction of $P$, and a hypothesis is a set of assumptions; $\leadsto_P$ is an attack
relation among the hypotheses. An admissible hypothesis $\Delta$ is one that can
attack every hypothesis which attacks it. The intuitive meaning of an assumption
$\mathit{not}\ a_1 \vee \cdots \vee \mathit{not}\ a_m$ is that $a_1 \wedge \cdots \wedge a_m$ cannot be proven from the disjunctive
program.

Given a hypothesis $\Delta$ of disjunctive program $P$, similar to the GL-transformation
[6], we can easily reduce $P$ into another disjunctive program without
default negation.

Definition 1. Let $\Delta$ be a hypothesis of disjunctive program $P$, then the reduct
of $P$ with respect to $\Delta$ is the disjunctive program

$P_\Delta^+ = \{head(r) \leftarrow body^+(r) \mid r \in P \text{ and } body^-(r) \subseteq \Delta\}$.

The following definition introduces a special resolution $\vdash_P$ which resolves
default-negation literals with a disjunction.

Definition 2. Let $\Delta$ be a hypothesis of disjunctive program $P$ and $A \in DB_P^+$. If
there exist $B \in DB_P^+$ and $\mathit{not}\ b_1, \ldots, \mathit{not}\ b_m \in \Delta$ such that $B = A \vee b_1 \vee \cdots \vee b_m$
and $P_\Delta^+ \models B$, then $\Delta$ is said to be a supporting hypothesis for $A$, denoted
$\Delta \vdash_P A$. Here $\models$ is the inference relation of the classical propositional logic.

The set of all positive disjunctions supported by $\Delta$ is denoted:

$cons_P(\Delta) = \{A \in DB_P^+ \mid \Delta \vdash_P A\}$.

To  derive  suitable  hypotheses  for  a  given  disjunctive  program,  some  constraints 
will  be  required  to  filter  out  unintuitive  hypotheses. 

Definition 3. Let $\Delta$ and $\Delta'$ be two hypotheses of disjunctive program $P$. If at
least one of the following two conditions holds:

¹ You et al. in [15] also defined an argumentative extension to the disjunctive
stable semantics. However, their framework does not lead to an intuitive well-founded
semantics for DLP as the authors have observed.


A  Comparative  Study  of  Well-Founded  Semantics  137 


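1. there exists $B = \mathit{not}\ b_1 \vee \cdots \vee \mathit{not}\ b_m \in \Delta'$, $m > 0$, such that $\Delta \vdash_P b_i$, for
all $i = 1, \ldots, m$; or

2. there exist $\mathit{not}\ b_1, \ldots, \mathit{not}\ b_m \in \Delta'$, $m > 0$, such that $\Delta \vdash_P b_1 \vee \cdots \vee b_m$,
then we say $\Delta$ attacks $\Delta'$, denoted $\Delta \leadsto_P \Delta'$.

Intuitively, $\Delta \leadsto_P \Delta'$ means that $\Delta$ causes a direct contradiction with $\Delta'$ and
the contradiction may come from one of the above two cases.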

Example  1. 


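$a \vee b \leftarrow$

$c \leftarrow d, \mathit{not}\ a, \mathit{not}\ b$

$d \leftarrow$

$e \leftarrow \mathit{not}\ e$

Let $\Delta' = \{\mathit{not}\ c\}$ and $\Delta = \{\mathit{not}\ a, \mathit{not}\ b\}$, then $\Delta \leadsto_P \Delta'$.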
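
The attack claimed in Example 1 can be checked mechanically. The Python sketch below follows Definitions 1 to 3 under stated assumptions: rules are encoded as triples of atom sets, a hypothesis is a set of atom sets (one per negative disjunction), and classical entailment is decided by brute-force enumeration of interpretations, which is only feasible for tiny programs. All function names are our own. Since a larger disjunction is easier to entail, `supports` adds every negative literal of the hypothesis at once instead of searching over subsets.

```python
from itertools import product


def rule(head, pos="", neg=""):
    """Hypothetical helper: (head atoms, positive body, atoms under 'not')."""
    return (frozenset(head.split()), frozenset(pos.split()), frozenset(neg.split()))


def reduct(program, delta):
    """Definition 1: keep rules whose negative literals all belong to delta."""
    return [(h, p) for h, p, n in program if all(frozenset({c}) in delta for c in n)]


def entails(positive_program, disjunction):
    """Classical entailment of a positive disjunction, by truth-table search."""
    atoms = sorted(set(disjunction).union(*[h | p for h, p in positive_program]))
    for bits in product([False, True], repeat=len(atoms)):
        true_atoms = {a for a, bit in zip(atoms, bits) if bit}
        is_model = all(not p <= true_atoms or h & true_atoms
                       for h, p in positive_program)
        if is_model and not (set(disjunction) & true_atoms):
            return False  # found a model falsifying the disjunction
    return True


def supports(program, delta, disjunction):
    """Definition 2: delta supports the positive disjunction."""
    literals = {next(iter(d)) for d in delta if len(d) == 1}
    return entails(reduct(program, delta), set(disjunction) | literals)


def attacks(program, delta, delta_prime):
    """Definition 3: condition 1 or condition 2 holds."""
    cond1 = any(all(supports(program, delta, {b}) for b in beta)
                for beta in delta_prime)
    literals = {next(iter(d)) for d in delta_prime if len(d) == 1}
    cond2 = bool(literals) and supports(program, delta, literals)
    return cond1 or cond2


P = [rule("a b"), rule("c", "d", "a b"), rule("d"), rule("e", neg="e")]
delta = {frozenset({"a"}), frozenset({"b"})}   # {not a, not b}
delta_prime = {frozenset({"c"})}               # {not c}
print(attacks(P, delta, delta_prime))          # True
print(attacks(P, set(), delta_prime))          # False
```

The second call shows that the empty hypothesis does not attack $\{\mathit{not}\ c\}$, because $c$ is only derivable once both $\mathit{not}\ a$ and $\mathit{not}\ b$ are assumed.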

The  next  definition  specifies  what  is  an  acceptable  hypothesis. 

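Definition 4. Let $\Delta$ be a hypothesis of disjunctive program $P$. An assumption
$B$ of $P$ is admissible with respect to $\Delta$ if $\Delta \leadsto_P \Delta'$ holds for any hypothesis $\Delta'$
of $P$ such that $\Delta' \leadsto_P \{B\}$.

Denote $\mathcal{A}_P(\Delta) = \{\mathit{not}\ a_1 \vee \cdots \vee \mathit{not}\ a_m \in DB_P^- \mid \mathit{not}\ a_i$ is admissible wrt
$\Delta$ for some $i$, $1 \le i \le m\}$.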

Originally, $\mathcal{A}_P$ also includes some other negative disjunctions. To compare with
different semantics, we omit them here. Another reason for doing this is that
information in form of negative disjunctions does not participate in inferring
positive information in DLP.

For any disjunctive program $P$, $\mathcal{A}_P$ is a monotonic operator. Thus, if $P$ is
finite then $\mathcal{A}_P$ has the least fixpoint $\mathrm{lfp}(\mathcal{A}_P)$ and $\mathrm{lfp}(\mathcal{A}_P) = \mathcal{A}_P^k(\emptyset)$ for some
$k \ge 0$.

Definition 5. The well-founded disjunctive hypothesis WFDH($P$) of
disjunctive program $P$ is defined as the least fixpoint of the operator $\mathcal{A}_P$. That is,
WFDH($P$) $= \mathcal{A}_P \uparrow \omega$.

The well-founded disjunctive semantics WFDS for $P$ is defined as the model
state WFDS($P$) $=$ WFDH($P$) $\cup\ cons_P($WFDH($P$)$)$.

By  the  above  definition,  WFDS(P)  is  uniquely  determined  by  WFDH(P). 

For the disjunctive program $P$ in Example 1, WFDH($P$) $= \{\mathit{not}\ c\}$ and
WFDS($P$) $= \{a \vee b, d, \mathit{not}\ c\}$. Notice that $e$ is unknown.

4  Transformation-Based  Semantics 

In this section we study the relation of the argumentation-based semantics to
the transformation-based semantics. We first introduce a new program
transformation so as to simplify the rule heads of disjunctive programs and then define
a new transformation-based semantics (called D-WFS*) as the most skeptical
semantics that satisfies both our new program transformation and Brass and
Dix's set $T_{WFS}$ of program transformations. Our new semantics D-WFS*
naturally extends the D-WFS in [2] and is no less skeptical than D-WFS. In fact,
this extension is meaningful because D-WFS seems too skeptical to derive useful
information from some disjunctive programs as the next example shows.

Example  2.  John  is  traveling  in  Europe  but  we  are  not  sure  which  city  he  is 
visiting.  We  know  that,  if  there  is  no  evidence  to  show  that  John  is  in  Paris, 
he  should  be  either  in  London  or  in  Berlin.  Also,  we  are  informed  that  John  is 
now  visiting  either  London  or  Paris.  This  knowledge  base  can  be  conveniently 
expressed  as  the  following  disjunctive  logic  program  P: 

$b \vee l \leftarrow \mathit{not}\ p$

$l \vee p \leftarrow$

Here, $b$, $l$ and $p$ denote that John is visiting Berlin, London and Paris,
respectively.

Intuitively, $\mathit{not}\ b$ (i.e. John is not visiting Berlin) should be inferred from $P$.
It can be verified that neither $b$ nor its negation $\mathit{not}\ b$ can be derived from $P$
under D-WFS and STATIC while $\mathit{not}\ b$ can be derived under WFDS.

The  intuition  behind  Minker’s  Generalized  Closed  World  Assumption  (GCWA) 
[9]  can  be  read  off  its  proof-theoretic  characterization: 

If, for every positive disjunction $A$, $P \vdash a \vee A$ implies $P \vdash A$, then $\mathit{not}\ a$ is
derivable from $P$, where $\vdash$ is the inference relation in the classical logic and $P$
is considered as a classical logic theory.

The above principle for positive DLP can be reformulated in general DLP as
follows:

If, for every conditional fact $a \vee A \leftarrow \mathit{not}\ C$, $P \vdash (a \vee A \leftarrow \mathit{not}\ C)$ implies
$P \vdash (A \leftarrow \mathit{not}\ C)$, then $\mathit{not}\ a$ is derivable from $P$, where $\vdash$ is the inference
relation in the classical logic and $P$ is considered as a classical logic theory.

However, D-WFS does not obey the above principle as Example 2 shows. In
fact, $P \vdash (b \vee l \leftarrow \mathit{not}\ p)$ implies $P \vdash (l \leftarrow \mathit{not}\ p)$ since $l \vee p$ is in $P$. But
$\mathit{not}\ b \notin$ D-WFS($P$).

According to [2], an abstract semantics can be defined as follows.

Definition 6. A semantics $\mathcal{S}$ is a mapping which assigns to every disjunctive
program $P$ a set $\mathcal{S}(P)$ of pure disjunctions such that the following conditions are
satisfied:

1. if $Q'$ is a sub-disjunction of pure disjunction $Q$ and $Q' \in \mathcal{S}(P)$, then
$Q \in \mathcal{S}(P)$;

2. if the rule $A \leftarrow$ is in $P$ for a (positive) disjunction $A$, then $A \in \mathcal{S}(P)$;

3. if $a$ is an atom and $a \notin head(P)$ (i.e. $a$ does not appear in the rule heads
of $P$), then $\mathit{not}\ a \in \mathcal{S}(P)$.

It should be noted that a semantics satisfying the above conditions is not
necessarily a suitable one because Definition 6 is still very general.

Besides the program transformations $T_{WFS}$ in [2], we also need a new
program transformation called Head reduction to define our semantics. This
definition is designed just to reflect the semantic intuition behind the GCWA as
mentioned at the beginning of this section.

Definition 7. An atom $a$ in disjunctive program $P$ is called GCWA-negated if,
for any rule $r$ in $P$ of form $a \vee A \leftarrow B, \mathit{not}\ c_1, \ldots, \mathit{not}\ c_t$, there is a rule $A' \leftarrow$
in $P$ such that $A'$ is a sub-disjunction of $A \vee c_1 \vee \cdots \vee c_t$.

For instance, $b$ can be GCWA-negated for the disjunctive program in Example 2.

Definition 8. A rule $r$ is an implication of another rule $r'$ if $head(r') \subseteq head(r)$,
$body(r') \subseteq body(r)$ and at least one inclusion is proper.

The definition of our new semantics D-WFS* will be based on the set $T^*_{WFS}$ of the
following six program transformations. In the sequel, $P_1$ and $P_2$ are disjunctive
programs:

- Unfolding: $P_2$ is obtained from $P_1$ by unfolding if there is a rule
$A \leftarrow b, B, not\ C$ in $P_1$ such that

    $P_2 = P_1 - \{A \leftarrow b, B, not\ C\}$
    $\cup\ \{A \vee (A' - \{b\}) \leftarrow B, B', not\ C, not\ C' \mid$
    there is a rule $A' \leftarrow B', not\ C'$ of $P_1$ such that $b \in A'\}$.

- Elimination of tautologies: $P_2$ is obtained from $P_1$ by elimination of
tautologies if there is a rule $A \leftarrow B, not\ C$ in $P_1$ such that
$A \cap B \neq \emptyset$ and $P_2 = P_1 - \{A \leftarrow B, not\ C\}$.

- Elimination of nonminimal rules: $P_2$ is obtained from $P_1$ by elimination
of nonminimal rules if there are two distinct rules $r$ and $r'$ of $P_1$ such
that $r$ is an implication of $r'$ and $P_2 = P_1 - \{r\}$.

- Positive reduction: $P_2$ is obtained from $P_1$ by positive reduction if
there is a rule $A \leftarrow B, not\ C$ in $P_1$ and $c \in C$ such that
$c \notin head(P_1)$ and
$P_2 = P_1 - \{A \leftarrow B, not\ C\} \cup \{A \leftarrow B, not\ (C - \{c\})\}$.

- Negative reduction: $P_2$ is obtained from $P_1$ by negative reduction if
there are two rules $A \leftarrow B, not\ C$ and $A' \leftarrow$ in $P_1$ such
that $A' \subseteq C$ and $P_2 = P_1 - \{A \leftarrow B, not\ C\}$.

- Head reduction: $P_2$ is obtained from $P_1$ by head reduction if there is a
rule $a \vee A \leftarrow B, not\ C$ in $P_1$ such that $a$ is GCWA-negated and
$P_2 = P_1 \cup \{A \leftarrow B, not\ C\} - \{a \vee A \leftarrow B, not\ C\}$.
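Three of the transformations above act on rules independently, so they can be sketched as set comprehensions. The Python below is a hedged illustration: the `Rule` tuple, the `rule` helper and the three-rule test program are assumptions of this sketch, rule heads are assumed non-empty, and each function applies its transformation exhaustively in one pass rather than as a single step.

```python
from typing import FrozenSet, NamedTuple, Set


class Rule(NamedTuple):
    """A disjunctive rule  head <- pos, not neg  (sets of atom names)."""
    head: FrozenSet[str]
    pos: FrozenSet[str]
    neg: FrozenSet[str]


def rule(head, pos=(), neg=()) -> Rule:
    """Convenience constructor (hypothetical helper)."""
    return Rule(frozenset(head), frozenset(pos), frozenset(neg))


def eliminate_tautologies(program: Set[Rule]) -> Set[Rule]:
    """Drop every rule whose head shares an atom with its positive body."""
    return {r for r in program if not (r.head & r.pos)}


def positive_reduction(program: Set[Rule]) -> Set[Rule]:
    """Remove 'not c' from bodies whenever c occurs in no rule head."""
    heads = set().union(*(r.head for r in program))
    return {Rule(r.head, r.pos, r.neg & heads) for r in program}


def negative_reduction(program: Set[Rule]) -> Set[Rule]:
    """Drop rules  A <- B, not C  when some fact  A' <-  has A' inside C."""
    facts = [r.head for r in program if r.head and not r.pos and not r.neg]
    return {r for r in program if not any(f <= r.neg for f in facts)}


# Hypothetical program:  p <- not q.   r <- not p.   s <- s.
PROGRAM = {rule({"p"}, neg={"q"}), rule({"r"}, neg={"p"}), rule({"s"}, pos={"s"})}
RESULT = negative_reduction(positive_reduction(eliminate_tautologies(PROGRAM)))
print(RESULT == {rule({"p"})})  # True
```

Positive reduction can be done in one pass because it never changes rule heads, and negative reduction never deletes a fact, so repeating either function yields the same program.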

Example 3. Consider the disjunctive program $P$ in Example 2. Since the atom $b$
is GCWA-negated, $P$ can be transformed into the following disjunctive
program $P'$ by Head reduction:

    $l \leftarrow not\ p$
    $l \vee p \leftarrow$

Suppose that $\mathcal{S}$ is a semantics. Then, by Definition 6,
$l \vee p \in \mathcal{S}$ and $not\ b \in \mathcal{S}$.

We say a semantics $\mathcal{S}$ satisfies a program transformation $T$ (or,
$\mathcal{S}$ is invariant under $T$) if $\mathcal{S}(P_1) = \mathcal{S}(P_2)$
for any two disjunctive programs $P_1$ and $P_2$ with $P_2 = T(P_1)$.

Let $\mathcal{S}$ and $\mathcal{S}'$ be two semantics. $\mathcal{S}$ is weaker
than $\mathcal{S}'$ if $\mathcal{S}(P) \subseteq \mathcal{S}'(P)$ for any
disjunctive program $P$.

We present the main definition of this section as follows.

Definition 9. (D-WFS*) The semantics D-WFS* for disjunctive programs is
defined as the weakest semantics allowing all program transformations in
$T^*_{WFS}$.

This definition is not constructive and thus cannot be used directly to compute
the semantics D-WFS* (a bottom-up procedure will be given in the next section).
In the rest of this section, we first look at some properties of D-WFS*.

As the following theorem shows, D-WFS*($P$) is well-defined for every
disjunctive program $P$. This is guaranteed by the following two lemmas.

Lemma 1. There is a semantics that satisfies all the program transformations
in $T^*_{WFS}$.

Lemma 2. Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be two semantics satisfying
$T^*_{WFS}$. Then their intersection
$\mathcal{S} = \mathcal{S}_1 \cap \mathcal{S}_2$ is also a semantics and
satisfies $T^*_{WFS}$.

Therefore, we have the following result, which shows that the semantics D-WFS*
assigns a unique model state D-WFS*($P$) to each disjunctive program $P$.

Theorem 1. For any disjunctive program $P$, D-WFS*($P$) is well-defined.

Since the set $T_{WFS}$ of program transformations in [2] is a subset of
$T^*_{WFS}$, our D-WFS* extends the original D-WFS in the following sense.

Theorem 2. Let $P$ be a disjunctive program. Then

    $\text{D-WFS}(P) \subseteq \text{D-WFS}^*(P)$.

The converse of Theorem 2 is not true in general. As we will see in Section 5,
for the disjunctive program $P$ in Example 2, $not\ b \in \text{D-WFS}^*(P)$ but
$not\ b \notin \text{D-WFS}(P)$. This theorem also implies that D-WFS* extends
the restricted STATIC, since D-WFS is equivalent to the restricted STATIC [3].

5  Bottom-Up  Computation 

Parallel to the computation for D-WFS [2], we will first provide a bottom-up
procedure for D-WFS* and then show the equivalence of D-WFS* and WFDS.
As a result, we actually provide a bottom-up computation for WFDS.

Let $P$ be a disjunctive program. Our bottom-up computation for D-WFS*($P$)
consists of two stages. In the first stage, $P$ is equivalently transformed
into a negative program $Lft(P)$, called the least fixpoint transformation of
$P$. The details of this transformation can be found in [2,13]. The basic idea
is to first evaluate body atoms of the rules in $P$ but delay the negative body
literals. In the second stage, $Lft(P)$ is further simplified into $res^*(P)$,
from which the semantics D-WFS*($P$) can be directly read off.

5.1 Strong Residual Program

In general, the negative program $Lft(P)$ can be further simplified by deleting
unnecessary rules, unnecessary body literals and unnecessary head atoms. This
leads to the idea of so-called reductions, which was first studied in [4] and
then generalized to the case of disjunctive logic programs in [2]. The
reduction of a disjunctive program $P$ is called the residual program of $P$.
The following is a generalization of Brass and Dix's residual programs.

Let $N_{GCWA}$ be the set of atoms that are GCWA-negated in a disjunctive
program $N$. The reduction operator $R^*$ is defined, for any negative program
$N$ (i.e., a set of conditional facts), as

    $R^*(N) = \{\, (A - a) \leftarrow not\ (C \cap head(N)) \mid$
        there is a rule $r \in N$: $A \leftarrow not\ C$ such that
        (1) there is no rule of the form $A' \leftarrow$ with $A' \subseteq C$,
        (2) there is no rule $r'$ s.t. $r$ is an implication of $r'$,
        (3) $a \in N_{GCWA}$ and $A - N_{GCWA} \neq \emptyset \,\}$.

The notion of the implication of rules can be found in Definition 8. For any
disjunctive program $P$, we can first transform it into the negative
disjunctive program $Lft(P)$ and then fully perform the reduction $R^*$ on
$Lft(P)$ to obtain a simplified negative program $res^*(P)$ (the strong
residual program of $P$). The iteration of $R^*$ stops after finitely many
steps, because $B_P$ contains a finite number of atoms and the total number of
atoms occurring in each $N$ is reduced by $R^*$. This procedure is precisely
formulated in the next definition, which has the same form as Definition 3.4
in [2] (the only difference is that we have a new reduction operator $R^*$
here).

Definition 10. (strong residual program) Let $P$ be a disjunctive program. Then
we have a sequence of negative programs $\{N_i\}_{i \geq 0}$ with
$N_0 = Lft(P)$ and $N_{i+1} = R^*(N_i)$. Let $t$ be the least index such that
$N_{t+1} = R^*(N_t) = N_t$. Then we call $N_t$ the strong residual program of
$P$ and denote it by $res^*(P)$.

Since Head reduction has been directly embedded into the operator $R^*$, the
following result can be obtained from Theorem 4.3 in [2]; it guarantees the
completeness of our bottom-up computation.

Theorem 3. Let $P$ and $P'$ be two disjunctive programs. If $P$ is transformed
into $P'$ by a program transformation in $T^*_{WFS}$, then
$res^*(P) = res^*(P')$.

This theorem has the following interesting corollary.

Corollary 1. Let $\mathcal{S}$ be a semantics satisfying
$\mathcal{S}(P) = \mathcal{S}(res^*(P))$ for all disjunctive programs $P$. Then
$\mathcal{S}$ allows all program transformations in $T^*_{WFS}$.

This corollary implies that, if $\mathcal{S}_0$ is a mapping from the set of
all strong residual programs to the set of model states and it satisfies all
properties in Definition 6, then the mapping defined by
$\mathcal{S}(P) = \mathcal{S}_0(res^*(P))$ is a semantics. Therefore, the
following lemma is obtained from the fact that D-WFS* is the weakest semantics.

Lemma 3. Given a disjunctive program $P$, we have

    $\text{D-WFS}^*(res^*(P)) = \text{D-WFS}^*_+(res^*(P)) \cup \text{D-WFS}^*_-(res^*(P))$

where

    $\text{D-WFS}^*_+(res^*(P)) = \{A \in DB^+_P \mid$ the rule $A' \leftarrow$ is in $res^*(P)$
        for some sub-disjunction $A'$ of $A\}$,

    $\text{D-WFS}^*_-(res^*(P)) = \{A \in DB^-_P \mid a \notin head(res^*(P))$
        for some atom $a$ appearing in $A\}$.

Thus, for any disjunctive program $P$, it is an easy task to get the semantics
D-WFS*($res^*(P)$) of its strong residual program.

The main theorem in this section can be stated as follows.

Theorem 4. For any disjunctive program $P$, we have

    $\text{D-WFS}^*(P) = \text{D-WFS}^*_+(P) \cup \text{D-WFS}^*_-(P)$

where

    $\text{D-WFS}^*_+(P) = \{A \in DB^+_P \mid$ the rule $A' \leftarrow$ is in $res^*(P)$
        for some sub-disjunction $A'$ of $A\}$,

    $\text{D-WFS}^*_-(P) = \{A \in DB^-_P \mid a \notin head(res^*(P))$
        for some atom $a$ appearing in $A\}$.

Example 4. Consider again the disjunctive program $P$ in Example 2. The strong
residual program $res^*(P)$ is as follows:

    $l \leftarrow not\ p$
    $l \vee p \leftarrow$

Thus, $\text{D-WFS}^*(P) = \{l \vee p,\ not\ b\}$.^2
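The read-off step of Theorem 4 is simple enough to sketch in Python. The representation below (a residual rule as a pair of a head set and a set of atoms under `not`) and the atom base `{l, p, b}` are assumptions of this sketch; the base is inferred from Examples 3 and 4, since Example 2 is not reproduced here. The function returns only the generating elements: the heads of facts and the atoms missing from every head.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A rule of a negative program:  head <- not neg
NegRule = Tuple[FrozenSet[str], FrozenSet[str]]


def read_off(residual: Iterable[NegRule], base: Iterable[str]):
    """Return (heads of facts, atoms a for which 'not a' holds)."""
    residual = list(residual)
    # Rules  A' <-  with an empty body generate the positive part.
    facts: Set[FrozenSet[str]] = {head for head, neg in residual if not neg}
    # Atoms outside head(res*(P)) generate the negative part.
    heads = set().union(*(head for head, _ in residual))
    negated = set(base) - heads
    return facts, negated


# Residual program of Example 4:   l <- not p.    l v p <- .
RES = [
    (frozenset({"l"}), frozenset({"p"})),
    (frozenset({"l", "p"}), frozenset()),
]
POSITIVE, NEGATIVE = read_off(RES, {"l", "p", "b"})
print(POSITIVE == {frozenset({"l", "p"})})  # True
print(NEGATIVE == {"b"})                    # True
```

Every positive disjunction containing one of the returned heads, and every negative disjunction mentioning one of the returned atoms, then belongs to the model state described by Theorem 4.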


5.2  Equivalence  of  WFDS  and  D-WFS* 

Before we present the main theorem of this section, we need some properties of
WFDS. First, we can justify that WFDS is a semantics in the sense of
Definition 6. Moreover, it possesses the following two properties, which can be
verified directly.

Proposition 1. WFDS satisfies all program transformations in $T^*_{WFS}$.

^2 D-WFS*($P$) should include all pure disjunctions implied by either
$l \vee p$ or $not\ b$. However, this slight abuse of notation simplifies our
presentation.

This proposition implies that the argumentation-based semantics WFDS is always
at least as strong as the transformation-based semantics D-WFS*.

The next result shows that the strong residual program $res^*(P)$ of a
disjunctive program $P$ is equivalent to $P$ w.r.t. the semantics WFDS.
Therefore, we can first transform $P$ into $res^*(P)$ and then compute
WFDS($res^*(P)$).

Proposition 2. For any disjunctive program $P$,

    $\text{WFDS}(P) = \text{WFDS}(res^*(P))$.

Since it has been shown in [2] that $Lft$ and their reduction operator $R$ can
be simulated by $T_{WFS} = T^*_{WFS} - \{\text{Head reduction}\}$, we have that
$Lft$ and $R^*$ can be simulated by $T^*_{WFS}$. Thus, the above proposition
holds.

Now we can state the main result of this section, which asserts the equivalence
of D-WFS* and WFDS.

Theorem 5. For any disjunctive logic program $P$,

    $\text{WFDS}(P) = \text{D-WFS}^*(P)$.

An important implication of this result is that the well-founded semantics WFDS
also enjoys a bottom-up procedure similar to that of D-WFS.


6  Unfounded  Sets 

The first definition of the well-founded model [12] is given in terms of
unfounded sets, and the notion of unfounded sets has proved to be a powerful
and intuitive tool for defining semantics for logic programs. This notion has
also been generalized to characterize stable semantics for disjunctive logic
programs in [7,5]. However, the two kinds of unfounded sets defined in [7,5]
cannot be used to define an intended well-founded semantics for disjunctive
programs.

Example 5.^3

    $a \vee b \leftarrow$

    $c \leftarrow not\ a, not\ b$

Intuitively, $not\ c$ should be derived from the above disjunctive program and,
in fact, many semantics including D-WFS, STATIC and WFDS assign the truth value
`false' to $c$. However, according to the definitions of unfounded sets in
[7,5], $c$ is not in any $n$-fold application of the well-founded operators on
the empty set. For this reason, a more reasonable definition of unfounded sets
for disjunctive programs is in order.

In this section, we will define a new notion of unfounded sets for disjunctive
programs and show that the well-founded semantics U-WFS defined by our
notion is equivalent to D-WFS* and WFDS.


^3 This example is due to Jürgen Dix (personal communication).

We say $body(r)$ of $r \in P$ is true wrt a model state $S$, denoted
$S \models body(r)$, if $body(r) \subseteq S$; $body(r)$ is false wrt a model
state $S$, denoted $S \models \neg body(r)$, if either (1) the complement of a
literal in $body(r)$ is in $S$ or (2) there is a disjunction
$a_1 \vee \cdots \vee a_n \in S$ such that
$\{not\ a_1, \ldots, not\ a_n\} \subseteq body(r)$.

In Example 5, the body of the second rule is false wrt $S = \{a \vee b\}$.

Definition 11. Let $S$ be a model state of a disjunctive program $P$. A set $X$
of ground atoms is an unfounded set for $P$ wrt $S$ if, for each $a \in X$ and
each rule $r \in P$ such that $a \in head(r)$, at least one of the following
conditions holds:

1. the body of $r$ is false wrt $S$;

2. there is $x \in X$ such that $x \in body^+(r)$;

3. if $S \models body(r)$, then $S \models (head(r) - X)$.

Notice that the above definition generalizes the notions of unfounded sets in
[7,5] in two ways. Firstly, the original ones are defined only for
interpretations (sets of ground literals) rather than for model states. An
interpretation is a model state but not vice versa. Secondly, though one can
redefine the original notions of unfounded sets for model states, such
unfounded sets are still too weak to capture the intended well-founded
semantics of some disjunctive programs. Consider Example 5 and let
$S = \{a \vee b\}$. According to Definition 11, the set $\{c\}$ is an unfounded
set of $P$ wrt $S$, but $\{c\}$ is not an unfounded set in the sense of Leone
or Eiter.

Having the new notion of unfounded sets, we are ready to define the well-known
operator $\mathcal{W}_P$ for any disjunctive program $P$.

If $P$ has the greatest unfounded set wrt a model state $S$, we denote it by
$\mathcal{U}_P(S)$. However, $\mathcal{U}_P(S)$ may be undefined for some $S$.
For example, let $P = \{a \vee b\}$ and $S = \{a, b\}$. Then $X_1 = \{a\}$ and
$X_2 = \{b\}$ are two unfounded sets wrt $S$ but $X = \{a, b\}$ is not. Here we
will not discuss the operator $\mathcal{U}_P(S)$ in detail.
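The two examples just discussed can be replayed with a direct encoding of Definition 11. The Python below is a sketch under stated assumptions: the `Rule` and `ModelState` tuples are representations chosen here, a model state is assumed to imply a disjunction when it contains a non-empty sub-disjunction of it, and condition 3 is read literally as an implication.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class Rule(NamedTuple):
    head: FrozenSet[str]
    pos: FrozenSet[str]  # positive body atoms
    neg: FrozenSet[str]  # atoms occurring under "not"


class ModelState(NamedTuple):
    pos: Set[FrozenSet[str]]  # positive disjunctions, each a set of atoms
    neg: Set[str]             # atoms a such that "not a" is in the state


def implies(state: ModelState, disj: FrozenSet[str]) -> bool:
    """Assumed reading of S |= disj: S holds a sub-disjunction of disj."""
    return any(len(d) > 0 and d <= disj for d in state.pos)


def body_true(state: ModelState, r: Rule) -> bool:
    return all(frozenset({b}) in state.pos for b in r.pos) and r.neg <= state.neg


def body_false(state: ModelState, r: Rule) -> bool:
    # (1) a positive body atom is negated in S, or
    # (2) S holds a disjunction whose atoms all occur under "not" in the body
    #     (this also covers a single atom c in S with "not c" in the body).
    return any(b in state.neg for b in r.pos) or any(
        len(d) > 0 and d <= r.neg for d in state.pos
    )


def is_unfounded(x: Set[str], program: Iterable[Rule], state: ModelState) -> bool:
    for r in program:
        if not (r.head & x):
            continue
        cond1 = body_false(state, r)
        cond2 = bool(r.pos & x)
        cond3 = (not body_true(state, r)) or implies(state, r.head - x)
        if not (cond1 or cond2 or cond3):
            return False
    return True


E = frozenset()
# Example 5:  a v b <- .   c <- not a, not b.   with S = {a v b}
EX5 = [Rule(frozenset("ab"), E, E), Rule(frozenset("c"), E, frozenset("ab"))]
S5 = ModelState({frozenset("ab")}, set())
print(is_unfounded({"c"}, EX5, S5))  # True

# P = {a v b} with S = {a, b}
P2 = [Rule(frozenset("ab"), E, E)]
S2 = ModelState({frozenset("a"), frozenset("b")}, set())
print(is_unfounded({"a"}, P2, S2), is_unfounded({"a", "b"}, P2, S2))  # True False
```

The last line reproduces the observation that $\{a\}$ is unfounded while $\{a, b\}$ is not, so no greatest unfounded set exists for that model state.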

Definition 12. Let $P$ be a disjunctive program. The operator $\mathcal{T}_P$
is defined as, for any model state $S$,

    $\mathcal{T}_P(S) = \{A \in DB^+_P \mid$ there is a rule $r \in P$:
        $A \vee a_1 \vee \cdots \vee a_n \leftarrow body(r)$
        such that $S \models body(r)$ and $not\ a_1, \ldots, not\ a_n \in S\}$.

Notice that $\mathcal{T}_P(S)$ is a set of positive disjunctions rather than
just a set of atoms.

Definition 13. Let $P$ be a disjunctive program. The operator $\mathcal{W}_P$
is defined as, for any model state $S$,

    $\mathcal{W}_P(S) = \mathcal{T}_P(S) \cup not.\mathcal{U}_P(S)$,

where $not.\mathcal{U}_P(S) = \{not\ p \mid p \in \mathcal{U}_P(S)\}$.

In general, $\mathcal{W}_P$ is a partial function because there may be no
greatest unfounded set wrt a model state $S$, as mentioned previously.

However, we can prove that $\mathcal{W}_P$ has a least fixpoint. Given a
disjunctive program $P$, we define a sequence of model states
$\{W_k\}_{k \in \mathbb{N}}$ where $W_0 = \emptyset$ and
$W_k = \mathcal{W}_P(W_{k-1})$ for $k > 0$.

Similar to Proposition 5.6 in [7], we can prove the following proposition.

Proposition 3. Let $P$ be a disjunctive program. Then

1. every model state $W_k$ is well-defined and the sequence
$\{W_k\}_{k \in \mathbb{N}}$ is increasing;

2. the limit $\bigcup_{k \geq 0} W_k$ of the sequence
$\{W_k\}_{k \in \mathbb{N}}$ is the least fixpoint of $\mathcal{W}_P$.

Since we consider only finite propositional programs in this paper, there is
some $t \geq 0$ such that $W_t = W_{t+1}$.

The well-founded semantics U-WFS is defined by

    $\text{U-WFS}(P) = lfp(\mathcal{W}_P)$.

For the program $P$ in Example 5, $\text{U-WFS}(P) = \{a \vee b,\ not\ c\}$.

An important result is that WFDS (and thus D-WFS*) can also be equivalently
characterized in terms of the unfounded sets defined in this section.

Theorem 6. For any disjunctive program $P$,

    $\text{WFDS}(P) = \text{U-WFS}(P)$.

Theorem 6 provides further evidence for the suitability of WFDS (equivalently,
D-WFS*) as the intended well-founded semantics for disjunctive logic programs.
Theorem 6 follows directly from the following lemma.

Lemma 4. Let $P$ be a disjunctive program. Then $W_k = S_k$ for any $k \geq 0$.

This lemma also reveals a kind of correspondence between the well-founded
disjunctive hypotheses and the unfounded sets.

7  Conclusion 

In this paper we have investigated recent approaches to defining well-founded
semantics for disjunctive logic programs. We first provided a minor
modification of the argumentative semantics WFDS defined in [13]. Based on some
intuitive program transformations, we proposed an extension of the D-WFS
in [2]. In our approach, we introduce a new program transformation called Head
reduction. This transformation plays a similar role in DLP as the GCWA [9] does
in positive DLP. We have also given a new definition of unfounded sets for
disjunctive programs, which is a generalization of the unfounded sets
investigated in [7,5]. This new notion of unfounded sets fully takes
disjunctive information into consideration and provides another
characterization of the disjunctive well-founded semantics. The main
contribution of this paper is the equivalence of U-WFS, D-WFS* and WFDS. We
have also provided a bottom-up computation for our semantics. A top-down
procedure, which is sound and complete with respect to our semantics, is
presented in [14]. These results show that there exists a disjunctive
well-founded semantics which can be characterized in terms of argumentation,
program transformations, unfounded sets and resolution. The
fact that different starting points lead to the same semantics provides strong
support for WFDS. Future work will concentrate on more efficient algorithms
and applications.

Abstract. This paper motivates and introduces entailment problems
over nonmonotonic theories some of whose predicates, called open
predicates, are not (completely) specified. More precisely, we are
interested in those inferences that hold for some or all possible
axiomatizations of the open predicates. Since a complete specification of
an open predicate may model incomplete knowledge about the world, this
kind of inference should distinguish missing object-level knowledge from
missing parts of the specification, and restrict nonmonotonic inference
accordingly. We formalize some interesting forms of such open entailment
problems, and provide formal proof techniques for some of them in a logic
programming framework.


1  Introduction 

In this paper we tackle the problem of deciding whether a given formula is
entailed by a nonmonotonic theory which has not been completely specified.
The motivation for this work stems from several application areas, including
the following:

—  Agent  programs  verification.  Given  a  logic-based  agent — such  as  an  IMPACT 
agent  [13] — it  may  be  necessary  to  verify  its  correct  behavior  by  proving 
that  certain  actions  will  never  be  executed,  or  that  some  action  will  surely 
be  taken  under  given  circumstances.  The  agent’s  actions  are  determined  by 
entailment  from  a  logic  program  whose  details  cannot  be  fully  specified  at 
verification  time  (e.g.,  the  precise  definition  of  the  agent’s  beliefs  and  goals 
would  most  likely  be  unavailable). 

—  Reasoning  about  actions  and  change  when  the  effects  of  some  actions,  or  the 
causal  links  between  certain  fluents,  have  not  been  specified  (e.g.,  because 
they  have  not  yet  been  identified). 

—  Security  policy  verification.  Security  policies  are  often  modelled  and  specified 
by  means  of  nonmonotonic  theories,  either  directly  [14,12]  or  indirectly,  by 
translating  the  specifications  into  logic  programs  with  negation  [2,4].  Part 
of  the  security  policy  may  be  unknown  [4],  e.g.,  because  it  is  to  be  decided 
by  a  different  organization,  or  because  it  is  subject  to  changes.  Thus,  some 
of  the  predicates  in  the  corresponding  logic  program  are  undefined  at  policy 
design  time.  Policies  should  be  verified  by  proving  that  certain  authorizations 
will/will  not  be  granted  (i.e.,  certain  atoms  will/will  not  be  derivable),  no 
matter  how  the  missing  details  are  filled  in  (see  [4]  for  further  details). 

T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 147-159, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001

In all these examples, standard nonmonotonic semantics would treat missing
predicates as if they were false for all arguments. Clearly, this is not
appropriate for the above reasoning tasks. One should rather consider all the possible
complete  definitions  of  those  predicates.  More  generally,  if  a  predicate  is  par¬ 
tially  specified,  all  the  complete  definitions  compatible  with  the  available  details 
should  be  considered.  In  classical  logic,  this  would  be  equivalent  to  proving  that 
a  certain  formula  is  a  logical  consequence  of  the  incomplete  specification.  In 
a  nonmonotonic  setting,  we  must  identify  hybrid  inference  mechanisms,  that 
lie  somewhere  in  between  classical  and  nonmonotonic  deduction.  In  particular, 
negation  as  failure  should  not  be  applied  to  any  predicate  whose  definition  is 
not  complete. 

We  start  a  formal  investigation  of  these  aspects  by  focussing  on  normal 
logic  programs  under  the  stable  model  semantics  (that  underlies — more  or  less 
explicitly — all  the  aforementioned  verification  problems).  Using  existing  termi¬ 
nology  [6],  by  open  program  we  mean  a  normal  logic  program  whose  domain  and 
predicates  are  not  completely  specified.  Section  3  formalizes  open  programs  and 
some  related,  interesting  inference  problems.  Section  4  introduces  provably  sound 
and  complete  techniques  for  solving  some  of  those  problems,  under  suitable  as¬ 
sumptions.  These  techniques  are  based  on  the  skeptical  resolution  calculus — that 
can  handle  open  domains — which  is  recalled  in  Section  2.  Section  6  concludes 
the  paper  with  a  list  of  interesting  open  problems  and  some  related  work. 


2  Preliminaries 

We  assume  the  reader  to  be  familiar  with  the  standard  notation  and  results  on 
logic  programming  [10]  and  the  stable  model  semantics  [8]. 

Let  metavariable  P  range  over  normal  logic  programs,  and  let  Ground(P) 
denote  the  ground  instantiation  of  P.  We  recall  that  a  support  of  a  ground 
atom  A  from  P  is  a  set  of  negative  literals  obtained  by  recursively  unfolding  A 
and  its  positive  subgoals  in  Ground(P),  until  only  negative  literals  are  left. 

In  the  main  part  of  the  paper,  the  skeptical  resolution  calculus  introduced 
in  [3]  will  be  adapted  to  open  entailment.  In  the  rest  of  this  section  we  recall  the 
basic  definitions. 

A ground countersupport for a ground atom $A$ from $P$ is a set of positive
literals $K$ such that:

1. each $B \in K$ is the complement of some literal belonging to a support of
$A$ from $P$;

2. conversely, each support of $A$ from $P$ contains a literal whose complement
is in $K$.

A (nonground) countersupport of an arbitrary atom $A$ from $P$ is a pair
$\langle K, \theta \rangle$ such that for all ground instances $A\theta\sigma$,
$K\sigma$ is a ground countersupport of $A\theta\sigma$.

The  skeptical  resolution  calculus  is  formulated  independently  of  any  specific 
mechanism  for  computing  negation  as  failure.  Such  mechanism  is  abstracted  by 


Reasoning  with  Open  Logic  Programs  149 


a  function  CounterSupp  that  maps  each  atom  A  onto  a  (possibly  empty)  set  of 
nonground  countersupports  for  A. 

Let $P$ be an arbitrary given program. A (simple) goal is a finite sequence of
literals. A goal with hypotheses (h-goal for short) is a pair $(G \mid H)$,
where $G$ is a simple goal and $H$ is a multiset of (positive or negative)
literals called hypotheses. Roughly speaking, the answer to a query
$(G \mid H)$ should be yes if $G$ holds in all the stable models that satisfy
$H$. Finally, a skeptical goal (s-goal for short) is a finite sequence of
h-goals; the empty sequence is denoted by $\Box$.

A skeptical derivation from $P$ and CounterSupp with restart goal $G_0$ is a
(possibly infinite) sequence of s-goals $\mathcal{G}_0, \mathcal{G}_1, \ldots$
where each $\mathcal{G}_{i+1}$ is obtained from $\mathcal{G}_i$ through one of
the following rewrite rules ($\Gamma$ and $\Delta$ are sequences of
h-goals).^1

Resolution. This rule may take two forms; a literal can be unified with either
a program rule or a hypothesis. First suppose that $L_i$ is an atom,
$A \leftarrow B_1, \ldots, B_k$ is a standardized apart variant of a rule of
$P$, and $\theta$ is the mgu of $L_i$ and $A$. Then the following is an
instance of the rule.

$$\frac{\Gamma\ (L_1 \ldots L_{i-1}, L_i, L_{i+1} \ldots L_n \mid H)\ \Delta}{[\Gamma\ (L_1 \ldots L_{i-1}, B_1, \ldots, B_k, L_{i+1} \ldots L_n \mid H)\ \Delta]\theta}$$

Secondly, let $L_i$ be a (possibly negative) literal, let $L'$ be a hypothesis,
and let $\theta$ be the mgu of $L_i$ and $L'$. Then the following is an
instance of the rule.

$$\frac{\Gamma\ (L_1 \ldots L_{i-1}, L_i, L_{i+1} \ldots L_n \mid H, L')\ \Delta}{[\Gamma\ (L_1 \ldots L_{i-1}, L_{i+1} \ldots L_n \mid H, L')\ \Delta]\theta}$$

Failure. Suppose that $L_i = \neg A$, and
$\langle \{B_1, \ldots, B_k\}, \theta \rangle \in \text{CounterSupp}(A)$.
Then the following is an instance of the Failure rule.

$$\frac{\Gamma\ (L_1 \ldots L_{i-1}, L_i, L_{i+1} \ldots L_n \mid H)\ \Delta}{[\Gamma\ (L_1 \ldots L_{i-1}, B_1, \ldots, B_k, L_{i+1} \ldots L_n \mid H)\ \Delta]\theta}$$

Contradiction. This rule tries to prove $(G \mid H)$ by showing that $H$ cannot
be satisfied by any stable model of $P$.

$$\frac{\Gamma\ (G \mid H, L)\ \Delta}{\Gamma\ (\bar{L} \mid H, L)\ \Delta}$$

Split. Essentially, this rule is needed to compute floating conclusions and
discover contradictions. It splits the search space by introducing a new
hypothesis. Let $G_0$ be the restart goal, $L$ be an arbitrary literal and
$\sigma$ be the composition of the mgus previously computed during the
derivation; the Split rule is:

$$\frac{\Gamma\ (G \mid H)\ \Delta}{\Gamma\ (G \mid H, L)\ (G_0\sigma \mid H, \bar{L})\ \Delta}$$

The h-goals $(G \mid H, L)$ and $(G_0\sigma \mid H, \bar{L})$ are called
restart h-goals.


^1 The restart goal $G_0$ will be needed in the Split rule below.

Success.

$$\frac{\Gamma\ (\Box \mid H)\ \Delta}{\Gamma\ \Delta}$$

A skeptical derivation is successful if its last s-goal is $\Box$; in this case
we say that the first s-goal $\mathcal{G}_0$ has a successful skeptical
derivation from $P$. As usual, the composition of the mgus computed during the
derivation, restricted to the variables of $\mathcal{G}_0$, is called the
answer substitution. Skeptical resolution is sound and complete w.r.t. the
skeptical stable model semantics, under a completeness assumption over
CounterSupp (see [3] for further details).

There exist derivation strategies that restrict the application of the Split
rule. Such strategies are strictly goal-directed for call-consistent programs.
A prototype based on a semi-naive metainterpreter has been implemented in XSB
Prolog (http://xsb.sourceforge.net). (Further details can be found in the
journal version of [3].)

3  Open  Programs  and  Open  Entailment 

In order to avoid ill-formed, possibly paradoxical definitions, assume two
fixed, infinite sets of function and predicate symbols are given, and denote
them with Func and Pred, respectively (as usual, constant symbols are
identified with 0-ary functions). Moreover, let Var be an infinite set of
variable symbols (following Prolog's conventions, they will be denoted with
uppercase letters). From now on, we shall consider only normal logic programs
built from these sets (when we write "for all programs" or "there exists a
program" we implicitly restrict the quantification accordingly).

Intuitively, an open program is a partially specified program $P$. Some
predicates, called "open predicates", are not completely specified in $P$, in
the sense that their definition might be completely missing, or it might
contain only some of the rules that define the predicate. The set of open
predicates will be identified with a set of symbols $O \subseteq$ Pred.
Moreover, the missing clauses might contain function symbols that do not
appear in $P$. Such symbols are listed in a set $F \subseteq$ Func.

Definition 1 (Open program). An open program is a triple
$\langle P, F, O \rangle$ where $P$ is a normal logic program,
$F \subseteq$ Func, and $O \subseteq$ Pred. The symbols in $F$ should not occur
in $P$ (while the symbols in $O$ may occur in $P$).

For any given open program $\langle P, F, O \rangle$, an open atom (resp.
literal) is any atom (resp. literal) whose predicate belongs to $O$.

The  next  definition  models  all  the  possible  ways  of  filling  in  the  missing 
details  of  an  open  program. 

Definition 2 (Open program completions). Let
$\Omega = \langle P, F, O \rangle$ be an open program. A normal program $P'$ is
a completion^2 of $\Omega$ if the following conditions hold:

1. $P' \supseteq P$;

2. the function symbols occurring in $P'$ but not in $P$ belong to $F$;

3. for all $r \in P' \setminus P$, the predicate symbol in the head of $r$
belongs to $O$.

The set of all possible completions of $\Omega$ will be denoted by
Comp($\Omega$) or, equivalently, by Comp($P, F, O$).

^2 The word "completion", referred to normal programs, traditionally denotes
Clark's completion [10]. Unfortunately, the author could not find any
suggestive alternative.
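The three conditions of Definition 2 are purely syntactic, so they can be checked directly. The Python sketch below is illustrative only; the term encoding is an assumption made here: variables are strings starting with an uppercase letter, constants are other strings, compound terms are tuples `(functor, arg, ...)`, an atom is `(predicate, args)`, and a rule is `(head_atom, body)` with `body` a tuple of `(is_positive, atom)` pairs. The test programs follow the open program with $P = \{p(a, X) \leftarrow \neg q(X)\}$, $F = \{b\}$ and $O = \{q\}$.

```python
def function_symbols(term):
    """Function and constant symbols occurring in a term."""
    if isinstance(term, tuple):  # compound term f(t1, ..., tn)
        symbols = {term[0]}
        for arg in term[1:]:
            symbols |= function_symbols(arg)
        return symbols
    return set() if term[:1].isupper() else {term}  # variable vs. constant


def program_symbols(program):
    """All function symbols occurring in the rules of a program."""
    symbols = set()
    for head, body in program:
        for _, args in [head] + [atom for _, atom in body]:
            for term in args:
                symbols |= function_symbols(term)
    return symbols


def is_completion(p, f, o, candidate):
    """Check conditions 1-3 of Definition 2 for `candidate`."""
    p, candidate = set(p), set(candidate)
    if not p <= candidate:                                   # condition 1
        return False
    if not program_symbols(candidate) - program_symbols(p) <= set(f):
        return False                                         # condition 2
    return all(head[0] in o for head, _ in candidate - p)    # condition 3


# P = { p(a, X) <- not q(X) },  F = {b},  O = {q}
P = {(("p", ("a", "X")), ((False, ("q", ("X",))),))}
F, O = {"b"}, {"q"}
P1 = P | {(("q", ("a",)), ())}                               # adds q(a)
P2 = P | {(("q", ("b",)), ())}                               # adds q(b)
P4 = P | {(("q", ("b",)), ((False, ("q", ("b",))),))}        # adds q(b) <- not q(b)
BAD_SYMBOL = P | {(("q", ("c",)), ())}                       # c is not in F
BAD_HEAD = P | {(("p", ("b", "b")), ())}                     # p is not open
print([is_completion(P, F, O, c) for c in (P1, P2, P4, BAD_SYMBOL, BAD_HEAD)])
```

The sketch does not verify the side condition of Definition 1 that the symbols of $F$ do not occur in $P$; it only checks a candidate against a given triple.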

Example 3. Consider an open program $\Omega$ with

    $P = \{p(a, X) \leftarrow \neg q(X)\}$,

    $F = \{b\}$,

    $O = \{q\}$.

Some of the completions in Comp($\Omega$) are:

    $P_1 = P \cup \{q(a)\}$,

    $P_2 = P \cup \{q(b)\}$,

    $P_3 = P \cup \{q(X) \leftarrow \neg p(X, Y)\}$,

    $P_4 = P \cup \{q(b) \leftarrow \neg q(b)\}$.

These programs differ in many respects. The Herbrand domain of $P_1$ and $P_3$
coincides with the domain of $P$, while the domain of $P_2$ and $P_4$ is
extended with $b$. Programs $P_1$ and $P_2$ are stratified while $P_3$ and
$P_4$ are not. Program $P_3$ has two stable models, while $P_4$ has no stable
models. $\Box$
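The stable model counts stated in this example can be confirmed by brute force on the ground instantiations. The Python below is a sketch, not the skeptical resolution calculus of Section 2: it enumerates candidate sets of atoms and applies the standard reduct construction. The rule encoding (head string, set of positive body atoms, set of negated body atoms) and the hand-written ground programs are assumptions of this sketch.

```python
from itertools import combinations


def least_model(definite_rules):
    """Least model of a negation-free ground program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model


def stable_models(rules):
    """All stable models of a ground normal program (exponential search)."""
    atoms = sorted({h for h, _, _ in rules}.union(*(p | n for _, p, n in rules)))
    models = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            m = set(candidate)
            # Reduct: drop rules blocked by m, then drop negative literals.
            reduct = [(h, p) for h, p, n in rules if not (n & m)]
            if least_model(reduct) == m:
                models.append(frozenset(m))
    return models


E = frozenset()
# Ground P2 over {a, b}:  p(a,a) <- not q(a).  p(a,b) <- not q(b).  q(b).
GROUND_P2 = [("p(a,a)", E, frozenset({"q(a)"})),
             ("p(a,b)", E, frozenset({"q(b)"})),
             ("q(b)", E, E)]
# Ground P3 over {a}:  p(a,a) <- not q(a).  q(a) <- not p(a,a).
GROUND_P3 = [("p(a,a)", E, frozenset({"q(a)"})),
             ("q(a)", E, frozenset({"p(a,a)"}))]
# Ground P4 over {a, b}:  P's instances plus  q(b) <- not q(b).
GROUND_P4 = [("p(a,a)", E, frozenset({"q(a)"})),
             ("p(a,b)", E, frozenset({"q(b)"})),
             ("q(b)", E, frozenset({"q(b)"}))]

print(len(stable_models(GROUND_P2)),
      len(stable_models(GROUND_P3)),
      len(stable_models(GROUND_P4)))  # 1 2 0
```

The search is exponential in the number of ground atoms, which is acceptable only for toy programs such as these.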

In the context of a given open program $\langle P, F, O \rangle$, a ground
literal is a variable-free literal belonging to the language of some
$P' \in$ Comp($P, F, O$) or, equivalently, any ground literal built with the
symbols occurring in $P$, $F$ and $O$.

We  are  ready  to  formalize  entailment  from  open  programs.  In  the  following, 
by  consistent  program  we  mean  a  normal  logic  program  with  at  least  one  stable 
model. 

Definition 4 (Open inference). For all open programs
$\Omega = \langle P, F, O \rangle$ and all first-order sentences $\Psi$:

1. (Credulous open inference) $\Omega \models^{c} \Psi$ iff for some
$P' \in$ Comp($\Omega$), $P'$ credulously entails $\Psi$.

2. (Skeptical open inference) $\Omega \models^{s} \Psi$ iff for all
$P' \in$ Comp($\Omega$), $P'$ skeptically entails $\Psi$.

3. (Mixed open inference I) $\Omega \models^{cs} \Psi$ iff for some consistent
$P' \in$ Comp($\Omega$), $P'$ skeptically entails $\Psi$.

4. (Mixed open inference II) $\Omega \models^{sc} \Psi$ iff for all consistent
$P' \in$ Comp($\Omega$), $P'$ credulously entails $\Psi$.

Note that without the consistency requirement on $P'$, mixed open inference would be trivial in most cases. It is often possible to build a pathological rule $p(a) \leftarrow \neg p(a)$ from the symbols in $F$ and $O$, and obtain an inconsistent $P' \in \mathrm{Comp}(\Omega)$. Then (without the consistency requirement) for all sentences $\Psi$ we would have $\Omega \models^{cs} \Psi$ and $\Omega \not\models^{sc} \Psi$.

The  four  forms  of  open  entailment  combine  two  aspects: 

- The quantification on $P'$ captures the kind of property to be verified. We may be interested either in proving that in some case something happens (e.g., if open predicates are completed in certain ways, then the program may make errors) or that in all cases some property is guaranteed (e.g., the program will always operate correctly, no matter how missing details are fixed).

- Credulous and skeptical stable model semantics are the two basic semantics available at the underlying application level.

Example 5. Consider the open program and the completions illustrated in Example 3. Since $p(a, a)$ is true in the unique stable model of $P_2$, we have both $\Omega \models^{c} p(a, a)$ and $\Omega \models^{cs} p(a, a)$. However, $p(a, a)$ is not in the stable model
of $P_1$, so $\Omega \not\models^{s} p(a, a)$. The sentence $q(a)$ is credulously entailed by $P_1$ and $P_3$, but not by $P_2$ ($P_4$ is ignored because it is inconsistent), so $\Omega \not\models^{sc} q(a)$. □
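
The four quantifier patterns of open inference can be made concrete on a finite sample of completions. The sketch below is an illustration under strong assumptions: each completion is represented only by its hand-computed list of stable models, queries are ground atoms, and the sample contains just the four completions of Example 3, whereas the full set of completions is infinite. A positive answer for the existential modes is therefore genuine, while the universal modes can only be refuted on such a sample, not established. The function name `open_entails` and the mode strings are hypothetical.

```python
def credulous(models, atom):
    """The atom holds in at least one stable model."""
    return any(atom in m for m in models)

def skeptical(models, atom):
    """The atom holds in every stable model (vacuously true if there are none)."""
    return all(atom in m for m in models)

def open_entails(completions, atom, mode):
    """completions: one list of stable models per completion P'."""
    consistent = [ms for ms in completions if ms]  # at least one stable model
    if mode == "c":    # some completion, credulously
        return any(credulous(ms, atom) for ms in completions)
    if mode == "s":    # all completions, skeptically
        return all(skeptical(ms, atom) for ms in completions)
    if mode == "cs":   # some consistent completion, skeptically
        return any(skeptical(ms, atom) for ms in consistent)
    if mode == "sc":   # all consistent completions, credulously
        return all(credulous(ms, atom) for ms in consistent)
    raise ValueError(f"unknown mode: {mode}")

# Stable models of P1..P4 from Example 3, computed by hand for this sketch.
SAMPLE = [
    [{"q(a)"}],               # P1
    [{"p(a,a)", "q(b)"}],     # P2
    [{"p(a,a)"}, {"q(a)"}],   # P3
    [],                       # P4 has no stable model
]

for mode in ("c", "s", "cs", "sc"):
    print(mode, open_entails(SAMPLE, "p(a,a)", mode), open_entails(SAMPLE, "q(a)", mode))
```

Filtering out inconsistent completions only in the two mixed modes mirrors the definition: without that filter the empty model list of P4 would make the skeptical test vacuously true and the credulous test false for every query.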

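When $\Omega$ is not intrinsically inconsistent, the four kinds of entailment can be compared as stated by the next proposition.

Proposition 6. Suppose there exists a consistent $P' \in \mathrm{Comp}(\Omega)$. Then, for all sentences $\Psi$,

1. $\Omega \models^{s} \Psi$ implies $\Omega \models^{cs} \Psi$ and $\Omega \models^{sc} \Psi$;

2. $\Omega \models^{cs} \Psi$ implies $\Omega \models^{c} \Psi$;

3. $\Omega \models^{sc} \Psi$ implies $\Omega \models^{c} \Psi$.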

Thus,  we  get  a  lattice  of  entailment  relations,  where  skeptical  open  entailment 
is  the  strongest  and  credulous  open  entailment  the  weakest. 

There  is  also  a  duality  between  pairs  of  entailments,  which  is  helpful,  as  the 
four  inference  problems  can  be  reduced  to  only  two  problems. 

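Proposition 7. For all open programs $\Omega$ and all sentences $\Psi$,

1. $\Omega \models^{c} \Psi$ iff $\Omega \not\models^{s} \neg\Psi$;

2. $\Omega \models^{sc} \Psi$ iff $\Omega \not\models^{cs} \neg\Psi$.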
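
The duality follows by unfolding the quantifiers in the definition of open inference. The derivation below is a sketch; the notation $\mathrm{SM}(P')$ for the set of stable models of $P'$ is introduced here for convenience and is not the paper's, and the argument relies on each stable model being two-valued, so that $M \models \Psi$ iff $M \not\models \neg\Psi$.

```latex
\begin{align*}
\Omega \models^{c} \Psi
  &\iff \exists P' \in \mathrm{Comp}(\Omega)\ \exists M \in \mathrm{SM}(P') :\ M \models \Psi \\
  &\iff \neg\,\forall P' \in \mathrm{Comp}(\Omega)\ \forall M \in \mathrm{SM}(P') :\ M \models \neg\Psi \\
  &\iff \Omega \not\models^{s} \neg\Psi , \\[4pt]
\Omega \models^{sc} \Psi
  &\iff \forall \text{ consistent } P' \in \mathrm{Comp}(\Omega)\ \exists M \in \mathrm{SM}(P') :\ M \models \Psi \\
  &\iff \neg\,\exists \text{ consistent } P' \in \mathrm{Comp}(\Omega)\ \forall M \in \mathrm{SM}(P') :\ M \models \neg\Psi \\
  &\iff \Omega \not\models^{cs} \neg\Psi .
\end{align*}
```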

Therefore, in the following we shall focus on $\models^{s}$ and $\models^{cs}$. They are based on skeptical inference, which, unlike credulous approaches, does not need the program to be instantiated before reasoning. Since, in general, the set of terms is not exactly specified, such instantiation may be expensive or even impossible (e.g., theoretically speaking, in the security policy verification problem the set of constants has no fixed a priori bound, as constants correspond to user names and data objects; in practice, their number is only bounded by operating system limitations, and is considerably high). Currently, it is not clear to what extent credulous approaches to open entailment are feasible.

4  Approaches to Skeptical Open Entailment

The  skeptical  resolution  calculus  can  be  adapted  to  open  entailment  by  a  few 
modifications.  Some  of  them  essentially  state  that  open  literals  (both  positive 
and  negative  ones)  should  be  treated  like  negative  literals. 

Accordingly, an open program $\langle P, F, O \rangle$ is range restricted if each variable occurring within an open or negative literal in the body of a rule $r \in P$ occurs either in the head of $r$ or in a positive, non-open literal of $r$.

Moreover, an open support for a ground atom $A$ w.r.t. an open program $\Omega = \langle P, F, O \rangle$ and $P' \in \mathrm{Comp}(\Omega)$ is a goal $G$ obtained by unfolding $A$ in $\mathrm{Ground}(P')$, until all the literals in $G$ are either open or negative.

A ground open countersupport for $A$ w.r.t. $\Omega$ and $P'$ is a set of ground literals $K$ such that

1. each $L \in K$ is the complement of some literal belonging to an open support of $A$ w.r.t. $\Omega$ and $P'$;

2. conversely, each open support of $A$ w.r.t. $\Omega$ and $P'$ contains a literal whose complement is in $K$.

In  the  open  setting,  the  Failure  rule  should  work  no  matter  how  the  Herbrand 
domain  can  be  extended.  This  requirement  leads  to  the  following  definitions. 

Let a $P'$-substitution be a substitution whose range is contained in the language of $P'$.

A (non-ground) open countersupport of an arbitrary atom $A$ w.r.t. $\Omega = \langle P, F, O \rangle$ is a pair $(K, \theta)$, where $\theta$ is a $P$-substitution, such that for all $P' \in \mathrm{Comp}(\Omega)$ and all grounding $P'$-substitutions $\sigma$, $K\sigma$ is a ground open countersupport for $A\theta\sigma$ w.r.t. $\Omega$ and $P'$.

Example 8. Consider again the open program of Example 3. Atom $p(a, b)$ has one open support, $\neg q(b)$, and a ground open countersupport $q(b)$ (w.r.t. $P_2$ and $P_4$). Moreover, $p(Y, X)$ has a nonground countersupport $(\{q(X)\}, \{Y = a\})$ (according to the intuition that $p(a, X)$ is false whenever $q(X)$ is true). □

Example 9. In [4], an identically empty (or inconsistent) policy template is presented as an example of verification of partially specified policies. The corresponding open program (where irrelevant details have been simplified away) has the following structure:

$$
\begin{aligned}
P = \{\ & p_1(X) \leftarrow p_2(X), \neg r(X), \\
        & p_1(X) \leftarrow p_3(X), r(X), \\
        & p_2(X) \leftarrow q(X), r(X), \\
        & p_3(X) \leftarrow q(X), \neg r(X) \ \}, \\
F = \ & \text{an infinite set of identifiers}, \\
O = \ & \{q, r\}.
\end{aligned}
$$

The atom $p_1(X)$ has the following open countersupports, where $\varepsilon$ denotes the empty substitution:

$(\{r(X)\}, \varepsilon)$, $(\{\neg r(X)\}, \varepsilon)$.

Note that countersupports may contain negative literals, because open atoms are treated like negative literals during open support computation. □

In the rest of the paper, we assume a function $\mathrm{CounterSupp}_\Omega$ that maps each atom $A$ onto a (possibly empty) set of (non-ground) open countersupports for $A$ w.r.t. $\Omega$. By analogy with the original skeptical resolution calculus, $\mathrm{CounterSupp}_\Omega$ is an abstract model of the actual implementation of negation as failure (possibly including issues related to loop-checking, or tabulation and delay), largely independent of particular implementation choices (cf. [3]).

Definition 10 (OSK-Derivations). An open skeptical derivation from $\Omega = \langle P, F, O \rangle$ (OSK-derivation, for short) is a skeptical derivation from $P$ where the Failure rule is based upon $\mathrm{CounterSupp}_\Omega$, and is never applied to any open atom.

Example 11. Consider again the open program and the open countersupports illustrated in Example 9. The following is a formalized version of the proof that the partially specified policy is inconsistent.

$$
\begin{array}{ll}
(\neg p_1(X) \mid\ ) & \text{Failure, using } (\{r(X)\}, \varepsilon) \\
(r(X) \mid\ ) & \text{Split} \\
(r(X) \mid r(X))\ \ (\neg p_1(X) \mid \neg r(X)) & \text{Resol. with hyp.} \\
(\Box \mid r(X))\ \ (\neg p_1(X) \mid \neg r(X)) & \text{Success} \\
(\neg p_1(X) \mid \neg r(X)) & \text{Failure, using } (\{\neg r(X)\}, \varepsilon) \\
(\neg r(X) \mid \neg r(X)) & \text{Resol. with hyp.} \\
(\Box \mid \neg r(X)) & \text{Success} \\
\Box &
\end{array}
$$

The answer substitution is empty, which means that $\Omega \models^{s} \forall X.\neg p_1(X)$ (cf. Theorem 13 below).

Note the mix of negation as failure (Failure rules and countersupports) and “classical” reasoning by cases (Split rule), which considers the possible values that $r(X)$ may take in different completions.

Space limitations do not allow more complex examples. Interested readers can find a complex policy for a hospital in [4], together with the translation into logic programs and a clear indication of what predicates are to be left open. Several policy verification proofs are included. They are all open skeptical entailment problems. □

The completeness of open skeptical derivations w.r.t. open entailment depends on the completeness of $\mathrm{CounterSupp}_\Omega$.

Definition 12. $\mathrm{CounterSupp}_\Omega$ is complete (w.r.t. $\Omega$) if for all ground atoms $A\gamma$ and all ground open countersupports $K$ for $A\gamma$ (w.r.t. $\Omega$ and some $P' \in \mathrm{Comp}(\Omega)$) there exist an open countersupport $(K', \theta) \in \mathrm{CounterSupp}_\Omega(A)$ and a substitution $\sigma$ such that $A\theta\sigma = A\gamma$ and $K'\sigma = K$.

The following theorem states that open skeptical resolution is sound and complete for open skeptical entailment. Note that the initial goal $G$ is restricted to the language of $P$. The other goals cannot be inferred with open skeptical inference, but resolution would not treat them properly. For example, from $P = \{p(X)\}$ one could erroneously derive all goals $p(a)$ such that $a \in F$.

Theorem 13. Let $G$ be a simple goal whose symbols occur in $P$. If $G$ has a successful OSK-derivation from $\Omega$ with answer substitution $\theta$, then $\Omega \models^{s} \forall G\theta$. Conversely, if $\mathrm{CounterSupp}_\Omega$ is complete and $\Omega \models^{s} G\sigma$ for some grounding $\sigma$, then $G$ has a successful OSK-derivation from $\Omega$ with answer substitution $\theta$ more general than $\sigma$.

In general, implementing a complete function $\mathrm{CounterSupp}_\Omega$ is a nontrivial (and sometimes impossible) task, and an extensive investigation of this issue must be deferred to an extended version of the paper. However, we have identified two interesting special cases where completeness can be easily achieved:

- If $F = \emptyset$ (i.e., the Herbrand domain is completely specified), then computing open countersupports is not harder than computing standard countersupports from a normal program. Open supports can be obtained by unfolding the given atom $A$ in $\mathrm{Ground}(P)$ until all the literals are either open or negative. Ground countersupports can then be obtained by collecting one literal from each support and negating it. This basic approach can be optimized in various ways, reducing redundancy, deriving nonground countersupports, etc.

- Suppose $\langle P, F, O \rangle$ is range restricted and generic,^ that is, the terms occurring in $P$ are all variables. (These are precisely the kind of programs we are using to verify the policy templates introduced in [4], cf. Example 9.) Then the countersupport construction illustrated in the previous point can be carried out from $P$ rather than $\mathrm{Ground}(P)$, and yields a provably complete function $\mathrm{CounterSupp}_\Omega$.
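
The construction described for the case of an empty $F$ (unfold until only open or negative literals remain, then pick one literal per support and negate it) can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: the program of Example 9 is grounded by hand over a single hypothetical constant `c`, literals are `(atom, is_positive)` pairs, the unfolding assumes an acyclic ground program, and only subset-minimal countersupports are kept.

```python
from itertools import product

def neg(lit):
    """Complement of a literal represented as (atom, is_positive)."""
    atom, positive = lit
    return (atom, not positive)

def open_supports(atom, rules, open_preds):
    """Unfold `atom` until every literal is open or negative (acyclic programs only)."""
    def residual(lit):
        name, positive = lit
        return (not positive) or name.split("(")[0] in open_preds

    def unfold(lits):
        for lit in lits:
            if not residual(lit):
                rest = [l for l in lits if l != lit]
                bodies = [body for head, body in rules if head == lit[0]]
                # No rule for the atom means no way to support it: empty result.
                return [s for body in bodies for s in unfold(rest + list(body))]
        return [frozenset(lits)]

    return set(unfold([(atom, True)]))

def countersupports(supports):
    """Negate one literal from each support; keep only subset-minimal sets."""
    picks = {frozenset(neg(l) for l in choice) for choice in product(*supports)}
    return {k for k in picks if not any(other < k for other in picks)}

# Example 9 grounded over one constant c (the constant is an assumption of this sketch).
RULES = [
    ("p1(c)", [("p2(c)", True), ("r(c)", False)]),
    ("p1(c)", [("p3(c)", True), ("r(c)", True)]),
    ("p2(c)", [("q(c)", True), ("r(c)", True)]),
    ("p3(c)", [("q(c)", True), ("r(c)", False)]),
]
SUPPORTS = open_supports("p1(c)", RULES, {"q", "r"})
COUNTER = countersupports(SUPPORTS)
print(SUPPORTS)
print(COUNTER)
```

On this input both rules for `p1` unfold to the same support, containing `q(c)`, `r(c)` and the negation of `r(c)`, so the sketch returns the two countersupports listed in Example 9 plus the singleton containing the negation of `q(c)`, which also satisfies the two conditions of the definition.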

^ We borrow this term from the theory of database queries.

Example 14. As an example of nonempty open predicate specifications, consider an open program $\Omega$ modelling reachability in a directed graph:

$$
\begin{array}{lll}
P = \{ & e(X, v) \leftarrow \neg e(v, X), & (1) \\
 & l(X, Y) \leftarrow e(X, Y), & (2) \\
 & l(X, Y) \leftarrow e(Y, X), & (3) \\
 & r(X, X), & (4) \\
 & r(X, Y) \leftarrow l(X, Z), r(Z, Y)\ \}, & (5) \\
F = & \text{an infinite set of identifiers, not including } v, & \\
O = & \{e\}. &
\end{array}
$$

The graph’s edges are specified by the open predicate $e$. All we know about the graph is that it has a star-shaped subgraph with central node $v$, which is directly connected to each other node $X$ either by an edge $(v, X)$ or by $(X, v)$. The other predicates model graph connectivity regardless of the edges’ direction. Predicate $l(X, Y)$ holds if there is an edge between $X$ and $Y$, in some direction. Predicate $r$ is the reflexive and transitive closure of $l$. We can prove that the graph is strongly connected (in all completions) by carrying out a successful OSK-derivation for $r(X, Y)$ with empty answer substitution (which means $\Omega \models^{s} \forall X \forall Y. r(X, Y)$). The derivation is the following:

$$
\begin{array}{ll}
(r(X,Y) \mid\ ) & \text{Resolution with (5)} \\
(l(X,Z), r(Z,Y) \mid\ ) & \text{Resolution with (2)} \\
(e(X,Z), r(Z,Y) \mid\ ) & \text{Split} \\
(e(X,Z), r(Z,Y) \mid e(X,Z))\ \ (r(X,Y) \mid \neg e(X,Z)) & \text{Resolution with hyp.} \\
(r(Z,Y) \mid e(X,Z))\ \ (r(X,Y) \mid \neg e(X,Z)) & \text{Resolution with (4)}\ \ (Z = Y) \\
(\Box \mid e(X,Y))\ \ (r(X,Y) \mid \neg e(X,Y)) & \text{Success} \\
(r(X,Y) \mid \neg e(X,Y)) & \text{Resolution with (5)} \\
(l(X,Z'), r(Z',Y) \mid \neg e(X,Y)) & \text{Resolution with (3)} \\
(e(Z',X), r(Z',Y) \mid \neg e(X,Y)) & \text{Resolution with (1)} \\
(\neg e(X,Z'), r(Z',Y) \mid \neg e(X,Y)) & \text{Resolution with hyp.}\ \ (Z' = Y) \\
(r(Y,Y) \mid \neg e(X,Y)) & \text{Resolution with (4)} \\
(\Box \mid \neg e(X,Y)) & \text{Success} \\
\Box &
\end{array}
$$

□
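
The property claimed in Example 14 can also be sanity-checked semantically on finite graphs. The sketch below does not run the OSK calculus and does not compute stable models: it simply assumes, as the informal reading of rule (1) suggests, that every node other than the centre is joined to it by an edge in one direction, adds arbitrary extra edges, and checks that the closure described by rules (2) to (5) relates every pair of nodes. All function names and the node labels are hypothetical.

```python
import random

def reachable(nodes, edges):
    """Map each node x to {y | r(x, y)}, reading l as e in either direction."""
    link = {x: {x} for x in nodes}            # rule (4): r is reflexive
    for x, y in edges:                        # rules (2), (3): l ignores direction
        link[x].add(y)
        link[y].add(x)
    reach = {}
    for start in nodes:                       # rule (5): closure by graph search
        seen, stack = {start}, [start]
        while stack:
            for nxt in link[stack.pop()] - seen:
                seen.add(nxt)
                stack.append(nxt)
        reach[start] = seen
    return reach

def random_star_graph(nodes, centre, rng):
    """Random edges in which every other node is joined to `centre` in one direction."""
    edges = {(centre, x) if rng.random() < 0.5 else (x, centre)
             for x in nodes if x != centre}
    edges |= {(rng.choice(nodes), rng.choice(nodes)) for _ in nodes}  # extra edges
    return edges

def fully_connected(nodes, edges):
    """True when r(x, y) holds for every pair of nodes."""
    return all(seen == set(nodes) for seen in reachable(nodes, edges).values())

if __name__ == "__main__":
    rng = random.Random(0)
    nodes = ["v", "a", "b", "c", "d"]
    print(all(fully_connected(nodes, random_star_graph(nodes, "v", rng))
              for _ in range(100)))
```

Random sampling can only fail to find a counterexample; it is the derivation, through the soundness theorem, that covers all completions, including those over unboundedly many identifiers.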


5  Restricted  Mixed  Open  Inference  of  Type  I 

A general approach to mixed inference is still an open problem. In this paper we sketch a preliminary approach that applies to completely undefined open predicates and unbounded domains. More precisely, in the context of an open program $\langle P, F, O \rangle$, we assume that the predicates in $O$ do not occur in the head of any rule of $P$, and $F$ is infinite.

The ground open resolution calculus should be extended with a new rule, called abduction rule:

$$
\frac{(G_1 \mid H_1) \cdots (G_{i-1} \mid H_{i-1})\ (G', L, G'' \mid H_i)\ (G_{i+1} \mid H_{i+1}) \cdots (G_n \mid H_n)}
     {(G_1 \mid L, H_1) \cdots (G_{i-1} \mid L, H_{i-1})\ (G', G'' \mid L, H_i)\ (G_{i+1} \mid L, H_{i+1}) \cdots (G_n \mid L, H_n)}
$$

where the predicate in $L$ belongs to $O$, and under the restriction that each $\{L, H_i\}$ must be consistent ($1 \le i \le n$). Intuitively, open predicates can be abduced as needed to complete the derivation.

Simple mixed derivations extend OSK-derivations with zero or more instances of the abduction rule.

Definition 15. Simple mixed derivations (SM-derivations, for short) are recursively defined as follows:

- An OSK-derivation is an SM-derivation.

- If $\mathcal{G}_0, \ldots, \mathcal{G}_n$ is an SM-derivation and $\frac{\mathcal{G}_n}{\mathcal{G}_{n+1}}$ is an instance of the abduction rule, then $\mathcal{G}_0, \ldots, \mathcal{G}_{n+1}$ is an SM-derivation.

The following theorem states that SM-derivations are sound and complete under the restrictions stated at the beginning of this section, and the further assumption that $P$ is call-consistent.

Theorem 16. Let $\Omega = \langle P, F, O \rangle$, where $P$ is call-consistent, $F$ is infinite and the predicates in $O$ do not occur in the head of any rule of $P$. Let $G$ be any ground simple goal. If $G$ has a successful, ground SM-derivation then $\Omega \models^{cs} G$. Conversely, if $\mathrm{CounterSupp}_\Omega$ is complete and $\Omega \models^{cs} G$, then $G$ has a successful, ground SM-derivation.

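Example 17. Let $\Omega$ be defined as follows:

$$
\begin{aligned}
P = \{\ & p(X) \leftarrow \neg q(X), r(X), \\
        & q(X) \leftarrow \neg p(X) \ \}, \\
F = \ & \{a, b\}, \\
O = \ & \{r\}.
\end{aligned}
$$

All the completions entailing $\neg r(a)$ entail also $q(a)$. Accordingly, there exists the following ground SM-derivation:

$$
\begin{array}{ll}
(q(a) \mid\ ) & \text{Resolution with } q(X) \leftarrow \neg p(X) \\
(\neg p(a) \mid\ ) & \text{Failure with } (\{\neg r(a)\}, \varepsilon) \\
(\neg r(a) \mid\ ) & \text{Abduction rule} \\
(\Box \mid \neg r(a)) & \text{Success} \\
\Box &
\end{array}
$$

□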

It should be possible to remove the restriction to ground derivations by keeping all the goals with hypotheses of the form $(\Box \mid H)$ in the derivation (e.g., by “turning off” the Success rule), and performing a final check that all such $H$ can be instantiated to consistent sets of hypotheses using the symbols in $F$.

Example 18. A nonground version of the derivation illustrated in the previous example, starting with $(q(X) \mid\ )$, would terminate with the goal $(\Box \mid \neg r(X))$. Clearly, the hypothesis $\neg r(X)$ can be consistently instantiated using the constants in $F$. □

Example 19. Let $P = \{\, p(a) \leftarrow q(X), \neg q(a) \,\}$ and $O = \{q\}$. Consider the SM-derivation

$$
\begin{array}{ll}
(p(a) \mid\ ) & \text{Resolution with } p(a) \leftarrow q(X), \neg q(a) \\
(q(X), \neg q(a) \mid\ ) & \text{Abduction rule} \\
(\neg q(a) \mid q(X)) & \text{Abduction rule} \\
(\Box \mid q(X), \neg q(a)) & \text{Success} \\
\Box &
\end{array}
$$

We have $\Omega \models^{cs} p(a)$ iff $F$ is not empty. Accordingly, the final hypotheses $q(X), \neg q(a)$ can be instantiated to a consistent set iff $F \neq \emptyset$. □

Similarly, it should be possible to remove the restriction to completely undefined open predicates by closing the hypotheses $H$ under the partial definitions of open programs during the final check.

The final check is in fact a particular constraint satisfaction problem. Detailed solutions to this problem are interesting subjects for further research.

6  Final  Discussion  and  Related  Work 

The definitions of open programs and open entailment can be immediately extended from logic programs to all nonmonotonic logics. By contrast, the proof techniques based on skeptical resolution are tailored to logic programs. For a more general approach, other calculi (and extensions thereof) should be considered (e.g., [1,5,11]).

At the current stage of investigation, we see no appealing way of approaching open entailment with credulous engines or calculi, because these techniques need to instantiate the theory. This conflicts with the need to handle (possibly unbounded) open domains. On closed domains, we are planning an experimental comparison of credulous and skeptical approaches. The latter might be more efficient on open programs due to their goal-directed nature, which might focus proof efforts on relevant completions.

The theoretical investigation of open programs and entailment is still at a very early stage, and many interesting questions are yet to be answered. More work is needed to obtain more general solutions to the entailment problems. Important (and partially related) issues such as the computational complexity of open entailment, expressiveness (i.e., which classes of properties can be checked via open entailment and skeptical resolution), and syntactic restrictions on completions (e.g., restricting completions to stratified programs) have not yet been explored.

Moreover,  there  is  some  interesting  related  literature  whose  relationships  with 
our  work  have  not  yet  been  investigated. 

In [9], a semantic approach to reasoning with open domains was introduced. In the most optimistic perspective, open skeptical resolution might eventually be adapted to reason with the programs introduced in [9]. Another approach compatible with open domains can be found in [11]. Both works support first-order quantification.

The  original  notion  of  open  programs  (e.g.,  [6])  adopted  a  fixed  underlying 
universe  and  was  introduced  for  characterizing  a  compositional  semantics  for 
logic  programs.  Some  of  those  results  could  be  of  use  for  understanding  inherent 
limitations  of  open  entailment. 

Later on, in [7], the term “open logic program” was used in a framework for integrating logic programming and classical first-order logic. There, open predicates are those defined by a classical first-order theory. The alphabet is fixed.

Abstract. We present a method to simultaneously learn definitions for a concept and its negation. This problem is relevant when we have to deal with a complex domain where it is difficult to acquire a complete theory and where we have to reason from incomplete knowledge. We use default logic to represent such incomplete theories. This paper specifies the problem of learning a default theory from a set of examples and background knowledge. We propose an operational method to inductively construct such a theory. Our learning process relies on a generalization mechanism defined in the field of Inductive Logic Programming. We first consider the case where the initial knowledge is sure because it contains only ground facts. Then, we extend the framework to the case where the initial knowledge is a default theory.


1  Introduction 

We present here a method for constructing a default theory from a set of positive and negative examples and initial background knowledge. The learning process that we propose is strongly related to research carried out in the field of Inductive Logic Programming (ILP). ILP investigates theory and methods to induce first-order clausal theories from examples and background knowledge [18]. More precisely, in the normal framework of ILP, if $B$ is the background knowledge and $E^+$ and $E^-$ are the sets of positive and negative examples respectively, the aim is to induce hypotheses $H$ such that $B \wedge H \models E^+$ and $B \wedge H \wedge E^- \not\models \bot$.

In most cases, $E^+$ and $E^-$ are examples of a single target predicate and $B$ and $H$ are definite Horn clauses (but some systems [6] induce full first-order theories). The ILP community has also considered the problem of using more expressive formalisms, especially in systems that construct clauses containing the negation as failure operator [1,2,15].

The problem that we consider in this paper extends [9] and concerns the simultaneous learning of definitions for a predicate $p$ and its negation $\neg p$. So in our framework, negative examples for a predicate $p$ will play the full role of leading to explicit definitions of $\neg p$. The relevance of this approach was first pointed out by De Raedt [5], who argued that the closed world assumption is not suited to the learning paradigm because we cannot assume that everything is known. Our proposition follows the same idea and is concerned with the construction of theories where it seems difficult to apply the closed world assumption. Let us imagine a secretary agent that must learn from observations when it must pass on a phone call to the manager. This concept seems difficult to define completely. So it is a good representation for the agent to define explicitly situations where the call can be passed on, and situations where the manager must not be disturbed. This formalism enables the agent to recognize cases where the concept remains undefined according to the current learned theory. A situation may also be undetermined because it satisfies at the same time a positive and a negative definition.

T. Eiter, W. Faber, and M. Truszczynski (Eds.): LPNMR 2001, LNAI 2173, pp. 160-172, 2001.

In order to give explicit definitions of $p$ and $\neg p$ and to deal with possible inconsistencies between them, we propose to represent the learned knowledge by a default theory. Default logic [22] is a powerful language to represent incomplete knowledge, which enables our method to obtain compact theories where the relationships between the definitions for $p$ and $\neg p$ appear clearly. In default logic, knowledge is represented by a default theory $(W, D)$, where $W$ is a set of classical formulas (the sure knowledge) and $D$ is a set of default rules (or defaults) that represent incompletely specified inference rules, often considered as rules with exceptions. Formally, a default $\frac{\alpha : \beta}{\gamma}$ has a consequent $\gamma$ and two types of antecedents: a prerequisite $\alpha$ and a justification $\beta$^. Then, the intuitive meaning of a default rule is: “if $\alpha$ is proved, and if $\neg\beta$ is not deducible (in other words if $\beta$ is coherent) then conclude $\gamma$”. In full generality, $\alpha$, $\beta$ and $\gamma$ can be any first-order logic formulas. But in our work, they will be formulas with free variables, like $p(X, Y)$, so our defaults are said to be open. As usual in default logic, each formula $p(X, Y)$ represents the set of all ground formulas $p(a, b)$ that can be obtained by instantiation with the constants of the domain. In this work, we only consider finite domains (without function symbols), so our set of open defaults is in fact a compact representation of a finite set of closed defaults (without free variables) obtained by instantiation over the constant set.

We recall below the definition of an extension, that is, a set of plausible conclusions inferred from a given closed default theory (see [22] for more details on default logic). A default theory is said to be closed if all its defaults are closed.

Definition 1. [22] Let $(W, D)$ be a closed default theory. For any set of closed formulas $S$, let $\Gamma(S)$ be the smallest set satisfying:

- $W \subseteq \Gamma(S)$

- $Th(\Gamma(S)) = \Gamma(S)$

- For any $\frac{\alpha : \beta}{\gamma} \in D$, if $\alpha \in \Gamma(S)$ and $\neg\beta \notin S$, then $\gamma \in \Gamma(S)$.

A set of closed formulas $E$ is an extension of $(W, D)$ iff $E = \Gamma(E)$.

A fundamental feature of default logic is its ability to represent incomplete knowledge, so it is not surprising that a default theory may have multiple extensions: one for each point of view that we can adopt with respect to the missing information. For instance, $(W, D) = (\{a\}, \{\ldots\})$ has two extensions $E_1 = Th(W \cup \{c\})$ and $E_2 = Th(W \cup \{\neg b\})$. That is why it is necessary to distinguish between skeptical (or cautious) theorems and credulous theorems. The former are formulas that occur in every extension ($c \vee \neg b$ in our previous example) and can be considered as sure deductions. The latter are formulas that occur in at least one extension ($c$ in our previous example) and are only hypothetical conclusions. As will be described later, this distinction is central in the paradigm that we present in our work.

^ If $\delta$ is a default rule, $pre(\delta)$, $jus(\delta)$ and $cons(\delta)$ respectively denote the prerequisite, the justification and the consequent of $\delta$.

The rest of the paper is organised as follows: section 2 considers default learning in the case where the initial theory does not contain defaults. Our methodology is illustrated on examples in section 3. In section 4, we develop the more general framework of learning with an initial theory that contains defaults. Then, we compare our work with other approaches in section 5.

2  Learning  Default  Theories 

2.1  Definition  and  Algorithm 

The following definition makes precise the framework of learning a default theory; it is inspired by a well-known semantic specification of ILP. In this section, we consider the special case where the initial background knowledge is expressed by ground facts.

Definition 2. Let $E^+ = \{p(a_1), \ldots, p(a_n)\}$ be a set of positive examples and $E^- = \{\neg p(a_1), \ldots, \neg p(a_m)\}$ a set of negative examples of a target predicate $p$. Let $W$ be an initial set of ground facts containing no occurrence of $p$ or $\neg p$. Learning a default theory for the concept described by $p$ and $\neg p$ consists in finding a default theory $(W', D')$ such that:

- $D'$ is a set of defaults, the consequents of which are $p$ or $\neg p$

- $W' = W \cup E_p$, where $E_p$ is a set of examples that cannot be generalized

- $(\bigwedge_{e \in E^+} e) \wedge (\bigwedge_{e \in E^-} e)$ is a skeptical theorem of $(W', D')$.

Definition 3. An example $e$ ($p(a)$ or $\neg p(a)$) is covered by a default theory $(W, D)$ if $e$ is a credulous theorem of $(W, D)$.

An example $e$ ($p(a)$ or $\neg p(a)$) is an exception to a default theory $(W, D)$ if $\neg e$ is a credulous theorem of $(W, D)$.

Our approach considers that the training examples constitute sure knowledge from which we induce default rules. As a default theory may have multiple mutually inconsistent extensions, our definition requires that the training examples become skeptical theorems of the induced default theory. For instance, let $E^+ = \{flies(1)\}$, $E^- = \{\neg flies(2)\}$ and $W = \{bird(1), bird(2), penguin(2)\}$, and let $D'$ be the default set $D' = \left\{ \frac{bird(X) : flies(X)}{flies(X)}, \frac{penguin(X) : \neg flies(X)}{\neg flies(X)} \right\}$. The theory $(W, D')$ does not satisfy definition 2. In fact, $(W, D')$ has two extensions $E_1 = Th(W \cup \{flies(1), flies(2)\})$ and $E_2 = Th(W \cup \{flies(1), \neg flies(2)\})$ and consequently, $\neg flies(2)$ is not a skeptical theorem. The reader can easily check that if we take $D' = \left\{ \frac{bird(X) : flies(X) \wedge \neg penguin(X)}{flies(X)}, \frac{penguin(X) : \neg flies(X)}{\neg flies(X)} \right\}$, $(W, D')$ is a solution to this simple learning problem.

The main idea is to give symmetric roles to positive and negative examples; the positive examples are used to build defaults defining $p$ and the negative examples are used to build defaults defining $\neg p$. Generalization of positive examples leads to a general rule defining $p$, but this definition may admit exceptions, which are found by examining the negative examples. Generalization from such a set of exceptions makes it possible to specialize the rule defining $p$ and moreover gives a general definition of $\neg p$. A symmetric treatment is also applied to generalization of negative examples. To summarize, our method to construct a set of defaults alternates generalization and specialization steps.

The generalization process is based on a generic ILP algorithm named here Gen($q, E^+, T, \varphi$) that, from a set $E^+$ of positive examples of the predicate $q$ and a background theory $T$, induces one definition $\varphi(X)$ that characterizes a part of $E^+$. More precisely, this means that the theory $T$ and the clause $(q(X) \text{ :- } \varphi(X))$ make it possible to prove all the examples $q(a)$ generalized by $\varphi$ (see section 3 for details).
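
The flies/penguin example given earlier in this section can be checked mechanically. The following is a minimal sketch under simplifying assumptions that are not the paper's: every formula is a ground literal (a leading `-` marks negation), a justification is a conjunction of literals, deductive closure is therefore just the literal set itself, and $W$ is assumed consistent. The first default set is the one suggested by the two extensions reported in the text; `gamma`, `extensions`, `skeptical` and `ground` are hypothetical helper names.

```python
from itertools import combinations

def neg(lit):
    """Complement of a literal; '-' marks classical negation."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def gamma(W, D, S):
    """Gamma(S) for literal theories: close W under defaults not blocked by S."""
    out, changed = set(W), True
    while changed:
        changed = False
        for pre, jus, cons in D:
            if pre in out and cons not in out and all(neg(j) not in S for j in jus):
                out.add(cons)
                changed = True
    return out

def extensions(W, D):
    """Consistent fixpoints E = Gamma(E), searched among W plus default consequents."""
    consequents = sorted({cons for _, _, cons in D})
    found = []
    for k in range(len(consequents) + 1):
        for extra in combinations(consequents, k):
            E = set(W) | set(extra)
            consistent = not any(neg(l) in E for l in E)
            if consistent and gamma(W, D, E) == E and E not in found:
                found.append(E)
    return found

def skeptical(W, D, lit):
    """True when the literal belongs to every extension."""
    return all(lit in E for E in extensions(W, D))

def ground(constants, open_defaults):
    """Instantiate open defaults (written with an {X} placeholder) over the constants."""
    return [(pre.format(X=c), tuple(j.format(X=c) for j in jus), cons.format(X=c))
            for c in constants for pre, jus, cons in open_defaults]

W = {"bird(1)", "bird(2)", "penguin(2)"}
PENGUIN = ("penguin({X})", ("-flies({X})",), "-flies({X})")
D_FIRST = ground([1, 2], [("bird({X})", ("flies({X})",), "flies({X})"), PENGUIN])
D_SECOND = ground([1, 2], [("bird({X})", ("flies({X})", "-penguin({X})"), "flies({X})"), PENGUIN])

print(len(extensions(W, D_FIRST)), skeptical(W, D_FIRST, "-flies(2)"))
print(len(extensions(W, D_SECOND)), skeptical(W, D_SECOND, "-flies(2)"))
```

Because coverage and exceptions are defined through credulous theorems, the same `extensions` function could be reused to test them with `any` in place of `all`.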

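Formally, the algorithms that we propose are the following.


Algorithm DefaultLearning

In : $p(X), W, E^+, E^-$; Out : $W', D'$

Begin
  While $\overline{E^+} \neq \emptyset$
    Gen($p, \overline{E^+}, W, \varphi$) searches $\varphi$ that generalizes a part of $\overline{E^+}$
    If a formula $\varphi$ is found, then
      Add to $D'$ the default $\delta =$ …
      Remove from $\overline{E^+}$ the examples generalized by $\varphi$
      $Exc \leftarrow \{e \in E^- \mid e$ is an exception to $(W', D')\}$
      If $Exc \neq \emptyset$ then Specialise($Exc, W, \delta, W', D', \{pre(\delta)\}$)
    else
      $W' \leftarrow W' \cup \overline{E^+}$ ; $\overline{E^+} \leftarrow \emptyset$
  Endwhile

  $JUS(X) \leftarrow \bigwedge \neg pre(\delta)$, for all $\delta \in D'$ s.t. $cons(\delta) = p(X)$

  $\overline{E^-} \leftarrow \{e \in E^- \mid e$ is not covered by $(W', D')\}$

  While $\overline{E^-} \neq \emptyset$
    Gen($\neg p, \overline{E^-}, W, \varphi$) searches $\varphi$ that generalizes a part of $\overline{E^-}$
    If a formula $\varphi$ is found, then
      Add to $D'$ the default …
      Simplify $JUS(X) = \bigwedge_i J_i(X)$ by removing each $J_i(X)$ s.t. there is no constant tuple $x$ satisfying $p(x) \in E^+$ and $W \vdash \varphi(x) \wedge J_i(x)$
      Remove from $\overline{E^-}$ the examples generalized by $\varphi$
    else
      $W' \leftarrow W' \cup \overline{E^-}$ ; $\overline{E^-} \leftarrow \emptyset$
  Endwhile
End

Algorithm Specialise

In : $Exc, W$; InOut : $\delta, W', D'$; In : $ForbForm$

Begin
  $\overline{Exc} \leftarrow Exc$
  While $\overline{Exc} \neq \emptyset$
    Gen($\neg cons(\delta), \overline{Exc}, W, \psi_{Exc}$) searches $\psi_{Exc}$ that generalizes a part of $\overline{Exc}$
    If $\psi_{Exc}$ is found and $\psi_{Exc} \notin ForbForm$, then
      $jus(\delta) \leftarrow jus(\delta) \wedge \neg\psi_{Exc}$
      Add to $D'$ the default $\delta_{Exc} =$ …
      Remove from $\overline{Exc}$ the examples generalized by $\psi_{Exc}$
      /* $E$ stands for $E^+$ (resp. $E^-$) if $cons(\delta) = p(X)$ (resp. $cons(\delta) = \neg p(X)$) */
      $Exc_{Exc} \leftarrow \{e \in E \mid e$ is an exception to $(W', D')\}$
      If $Exc_{Exc} \neq \emptyset$ then Specialise($Exc_{Exc}, W, \delta_{Exc}, W', D', ForbForm \cup \{pre(\delta_{Exc})\}$)
    else
      $W' \leftarrow W' \cup \overline{Exc}$ ; $\overline{Exc} \leftarrow \emptyset$
  Endwhile
End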

In our main algorithm DefaultLearning, $p$ stands for the predicate to learn;
$E^+$ and $E^-$ are the sets of positive and negative examples of the concept; $W$ is
the initial theory; $W'$ is the theory $W$, possibly augmented by some examples
that cannot be generalized; $D'$ is a set of defaults whose consequents
are $p$ or $\neg p$. For clarity, the algorithm is written assuming that we begin
by learning $p$. But as our method deals with positive and negative examples in a
symmetric manner, it could just as well begin by learning $\neg p$, exchanging the roles
of $E^+$ and $E^-$.

The process starts with a generalization step: the learning algorithm
$Gen$ is applied to $E^+$ in order to compute one formula $\varphi$ that represents
a subset of $E^+$. If it is possible to find such a formula, the default
$\delta = \frac{\varphi(X) : p(X)}{p(X)}$ is built into $D'$.

This default $\delta$ may admit exceptions (see definition 3). The set of exceptions is
obtained by checking, for each $\neg p(e)$ in $E^-$, whether $p(e)$ is a theorem of $(W', D')$.
If the set of exceptions is not empty, we must specialize $\delta$. $Gen$ is used to induce
a formula $\psi$ that generalizes these exceptions, and we modify the default $\delta$ into
$\frac{\varphi(X) : p(X) \wedge \neg\psi(X)}{p(X)}$. In this way, $\delta$ is no longer applicable to the negative
examples that verify $\psi(X)$. At the same time, these negative examples, generalized
by $\psi(X)$, lead to a general definition of $\neg p$, represented by the default
$\frac{\psi(X) : \neg p(X)}{\neg p(X)}$. This default is specialized in its turn if necessary. This recursive
process always ends because we use the set of forbidden formulas, $ForbForm$, which
avoids possible loops in the situations where exceptions and examples are generalized
by the same formula. For instance, with $E^+ = \{flies(1), flies(2)\}$, $E^- =
\{\neg flies(3), \neg flies(4)\}$ and $W = \{bird(1), bird(2), bird(3), bird(4)\}$, we obtain
the first default $\delta_1 = \frac{bird(X) : flies(X)}{flies(X)}$, which has the exceptions $E^-$. If we
specialize $\delta_1$ without taking $ForbForm$ into account, we obtain the new default
$\delta_2 = \frac{bird(X) : \neg flies(X)}{\neg flies(X)}$, and $\delta_1$ is specialized into $\delta_1' = \frac{bird(X) : flies(X) \wedge \neg bird(X)}{flies(X)}$.
This is not acceptable because the positive examples $flies(1), flies(2)$ are no
longer covered by this theory and become exceptions to $\delta_2$, which leads to a
loop in this recursive specialization. The use of $ForbForm$ makes it possible to find the
final theory $\left(W \cup \{\neg flies(3), \neg flies(4)\}, \left\{\frac{bird(X) : flies(X)}{flies(X)}\right\}\right)$, because an example
$e$ that cannot be generalized is simply added to $W'$ as a ground fact. This
ensures that $e$ is a skeptical theorem of $(W', D')$.
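
The effect of $ForbForm$ on the bird example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `gen` is a hypothetical stand-in for the ILP generalizer (it just looks for a background predicate true of every given example), individuals are plain integers, and an exception is approximated as a negative example that satisfies the prerequisite of the first default.

```python
def gen(examples, background):
    """Hypothetical stand-in for Gen: a predicate true of every example, or None."""
    for pred, members in sorted(background.items()):
        if examples and examples <= members:
            return pred
    return None

def specialise(exceptions, background, forbidden):
    """Return (formula used to specialise, exceptions kept as ground facts)."""
    psi = gen(exceptions, background)
    if psi is not None and psi not in forbidden:
        return psi, set()
    # No admissible generalization: the exceptions are added to W' as facts.
    return None, set(exceptions)

background = {"bird": {1, 2, 3, 4}}
positives, negatives = {1, 2}, {3, 4}

phi = gen(positives, background)                  # prerequisite of the first default
exceptions = {e for e in negatives if e in background[phi]}

looping = specialise(exceptions, background, forbidden=set())
safe = specialise(exceptions, background, forbidden={phi})
```

Without the forbidden set, the exceptions are generalized by `bird` again, which is the looping situation described above; with `bird` forbidden, the two negative examples are simply kept as ground facts.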

When all the positive examples are generalized (first Endwhile), we check
whether there are still any negative examples not covered by the current theory.
If so, we begin to complete the definition of $\neg p$ by a similar process.
At this time, all the positive examples, which are the potential exceptions for
defaults defining $\neg p$, have already been treated. So the formulas characterizing
these possible exceptions have already been computed: they are the prerequisites
of some defaults defining $p$. That is why all the new defaults that are introduced
for $\neg p$ are constrained by a justification $JUS$ that is the conjunction of the
negated prerequisites of all defaults concluding $p$. In this way, we avoid the computation
of exceptions, which is an expensive process. The counterpart of this strategy is
that the justifications of these last defaults are certainly too complex; they
are simplified by a mechanism that checks, for each new default, whether these
formulas really correspond to some exceptions.

A last point to notice is that in our algorithm the sets of examples that
are not yet covered ($\overline{E^+}$ and $\overline{E^-}$) decrease each time $Gen$ generalizes a
subset of examples: this is the principle of iterative covering common to many
learning algorithms. But when we have to determine the exceptions to a current
theory, we must take into account the initial sets of examples $E^+$ and $E^-$. This
is necessary to be sure that we have found all the possible exceptions.
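
As a rough illustration of iterative covering, the following Python sketch keeps a shrinking working copy of the examples while leaving the initial set untouched, so that the initial set remains available for exception checks. The function name `covering`, the `candidates` table (formula name mapped to the examples it generalizes) and the greedy choice of the formula covering most remaining examples are assumptions made for the sketch, not the selection criterion of any particular ILP system.

```python
def covering(examples, candidates):
    """Greedy iterative covering over a working copy of `examples`."""
    remaining = set(examples)        # working copy; `examples` is never modified
    chosen, leftovers = [], set()
    while remaining:
        best = max(candidates,
                   key=lambda f: len(candidates[f] & remaining),
                   default=None)
        if best is None or not (candidates[best] & remaining):
            # Nothing generalizes the rest: keep it as ground facts and stop.
            leftovers, remaining = remaining, set()
        else:
            chosen.append(best)
            remaining -= candidates[best]
    return chosen, leftovers

initial = {1, 2, 3, 4, 5}
chosen, leftovers = covering(initial, {"f": {1, 2, 3}, "g": {3, 4}})
```

Here `initial` still holds all five examples after the call, which is what an exception test against the complete example sets would need.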


2.2  Correctness 

The work presented here extends a previous method [9] that concerned only
Lukaszewicz' default theories, where the existence of an extension is guaranteed.
In Reiter's default logic, this point must be studied more carefully.

Theorem 1. The algorithm DefaultLearning induces a default theory that
always has an extension.

Proof: In [14] it is shown that a Reiter default theory has at least one
extension if its block-graph contains only even cycles. For a default theory $(W, D)$,
the block-graph is a pair $(\overline{D}, A)$. The vertex set $\overline{D}$ contains all closed defaults
obtained from $D$ except those that are incompatible with $W$, i.e. defaults $\delta$ s.t.
$W \vdash \neg jus(\delta)$. In our particular case, the arc set $A$ contains the pair $(\delta, \delta')$ if $\delta$
"blocks" $\delta'$, i.e. $W \vdash pre(\delta)$ and $W \cup cons(\delta) \vdash \neg jus(\delta')$. By construction, each
induced default is normal or semi-normal* and its consequent is $p(X)$ or $\neg p(X)$.
So only even cycles may exist, and our learned default
theories therefore always have an extension. □
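
The "only even cycles" condition can be tested mechanically once a block-graph is given. The Python sketch below is an illustration, not the procedure of [14]: it assumes the graph is supplied as a dictionary from each vertex to the vertices it blocks (the vertex names are hypothetical) and uses the fact that a directed graph contains an odd cycle exactly when some vertex can reach itself by a walk of odd length.

```python
from collections import deque

def has_odd_cycle(arcs):
    """arcs: dict mapping each vertex to the list of vertices it blocks."""
    for start in arcs:
        seen = {(start, 0)}                 # (vertex, parity of walk length)
        queue = deque([(start, 0)])
        while queue:
            vertex, parity = queue.popleft()
            for target in arcs.get(vertex, ()):
                state = (target, 1 - parity)
                if state == (start, 1):
                    return True             # odd closed walk, hence an odd cycle
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False

# Two defaults concluding p(nixon) and not p(nixon) block each other.
mutual_block = {"d_p_nixon": ["d_notp_nixon"], "d_notp_nixon": ["d_p_nixon"]}
triangle = {"a": ["b"], "b": ["c"], "c": ["a"]}
```

A pair of mutually blocking defaults forms a cycle of length two, so `has_odd_cycle(mutual_block)` is false, whereas the three-vertex cycle in `triangle` is reported as odd.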

When it ends, our algorithm guarantees that all examples are covered (each
given example $e$ is a credulous theorem) and that there are no remaining exceptions
(for each given example $e$, $\neg e$ is not a credulous theorem). So we have to
prove that this is sufficient to make all the examples skeptical theorems, as
required by definition 2.

Theorem 2. Let $(W, D)$ be a default theory induced by the algorithm DefaultLearning
and $e$ a given example.

If $e$ is a credulous theorem of $(W, D)$ and $\neg e$ is not a credulous theorem of
$(W, D)$, then $e$ is a skeptical theorem of $(W, D)$.

Proof: Without loss of generality, we assume that $e$ is a positive example $p(a)$ (the
proof for a negative example is similar) such that $p(a)$ is a credulous theorem
and $\neg p(a)$ is not a credulous theorem. Since $p(a)$ is a credulous theorem,
there exists a closed default $\delta = \frac{\alpha(a) : p(a) \wedge \beta(a)}{p(a)}$ (in particular we may have
$\beta(a) \equiv true$) with $W \vdash \alpha(a)$.

Let us suppose that $p(a)$ is not a skeptical theorem. In other words, there
exists an extension $E$ not containing $p(a)$; then $\delta$ is blocked in $E$, that is, $E \vdash
\neg p(a) \vee \neg\beta(a)$. Since $\neg p(a)$ is not a credulous theorem, it is never possible to
obtain $\neg p(a)$, so the only way to block $\delta$ is to derive $\neg\beta(a)$. $\neg\beta(a)$ cannot be
obtained by a default $\delta'$ because its consequent can only be $p(X)$ or $\neg p(X)$. So we must
have $\neg\beta(a) \in W$. But in this case, $\delta$ is always blocked and $p(a)$ is not a credulous
theorem. This contradiction gives our result. □

* A default is semi-normal if it has the form $\frac{\alpha(X) : \beta(X) \wedge \gamma(X)}{\gamma(X)}$.

The  next  section  gives  examples  illustrating  our  methodology. 

3  Commented  Examples 

In order to test the relevance of our method, we have simulated its main steps
on some artificial examples. It is fundamental in our work to compute generalization
formulas that may have exceptions. This can be achieved in ILP systems
(like FOIL [21] for instance) by allowing a certain level of noise. But it is difficult
to adjust this parameter: if we accept a high level of noise, we find formulas that are
too general; if the level of noise is too low, the generalization is too specific or
impossible because of the exceptions. To avoid this difficulty, which must be studied
carefully for each application domain, we use a generalization mechanism
that relies only on positive examples. The ILP system Progol [16] has the ability
to learn from positive data only [17], and we use it as the generalization tool
described by the function $Gen$ in our algorithm. Progol is an ILP system based
on inverse entailment. The input file for Progol specifies the set of positive examples
and the initial background theory, which may contain definite Horn clauses
but also integrity constraints expressed by headless Horn clauses. Moreover, the
user specifies type and mode declarations for the predicates. These biases are
very important in determining the space of possible generalizations that Progol
searches with an A*-like algorithm in order to return a clause that achieves the
best data compression.

In the following examples, the different stages of our method have been simulated
by switching between learning steps for $p$ and $\neg p$. When learning $p$, the theory with
only $E^+$ was considered, and in order to learn $\neg p$, the negative examples were
considered with $\neg p$ renamed into an ad hoc predicate $not\_p$. The covering tests,
which are necessary to determine which examples are not yet generalized and also
to determine exceptions to a default, require either extension calculus or query
answering in Reiter's default logic. For both tasks, operational systems exist (for
instance DeRes [4], GADEL [19], XRay [20]), and they could be integrated in a
complete system for default theory learning.


Example 1. The initial theory $W$ concerns a set of people and a set of dishes*.

$$W = \left\{ \begin{array}{l}
hb(1), \ldots, hb(45), hb(46), \ldots, hb(50), hb(51), \ldots, hb(55), \\
v(46), \ldots, v(50), diab(51), \ldots, diab(55), \\
a(mutton), a(beef), a(fish), \\
di(mutton), di(beef), di(fish), \\
oa(egg), oa(milk), di(egg), di(milk), \\
sug(ice\_cream), sug(cake), di(ice\_cream), di(cake)
\end{array} \right\}$$

The aim is to induce what people eat and what they do not eat from the
following sets of examples.

$$E^+ = \left\{ \begin{array}{l}
eats(2, egg), \ldots, eats(50, egg), \\
eats(1, milk), \ldots, eats(50, milk), eats(1, mutton), \ldots, eats(45, mutton), \\
eats(1, beef), \ldots, eats(45, beef), eats(1, fish), \ldots, eats(45, fish)
\end{array} \right\}$$

$$E^- = \left\{ \begin{array}{l}
\neg eats(1, egg), \\
\neg eats(46, mutton), \ldots, \neg eats(50, mutton), \\
\neg eats(46, beef), \ldots, \neg eats(50, beef), \\
\neg eats(46, fish), \ldots, \neg eats(50, fish), \\
\neg eats(51, ice\_cream), \ldots, \neg eats(55, ice\_cream), \\
\neg eats(51, cake), \ldots, \neg eats(55, cake)
\end{array} \right\}$$

Let us suppose that we begin by learning the definition of $eats(X, Y)$. So we run
Progol in order to generalize from the examples $E^+$ and $W$. The best clause according
to Progol's evaluation is $(eats(X,Y) :- hb(X), oa(Y))$, which means that
all persons eat dishes that have an animal origin (eggs and milk). From this
formula we build into $D'$ a first default $\delta_1 = \frac{hb(X) \wedge oa(Y) : eats(X,Y)}{eats(X,Y)}$. By examining
the set of negative examples, we find that this default admits only one
exception, $\neg eats(1, egg)$, which cannot lead to a relevant generalization. So this
exception $\neg eats(1, egg)$ is added to $W'$. There are still some positive examples
that are not covered by $(W', D')$, and a second call to the generalization of
Progol returns the clause $(eats(X,Y) :- hb(X), a(Y))$. So we build the default
$\delta_2 = \frac{hb(X) \wedge a(Y) : eats(X,Y)}{eats(X,Y)}$. This default admits a set of exceptions $Exc_{\delta_2} =
\{\neg eats(46, mutton), \ldots, \neg eats(50, fish)\}$. In order to characterize these exceptions
by a general formula, we submit this subset of examples to Progol (after
replacing $\neg eats$ by $not\_eats$). Progol returns the clause $(not\_eats(X,Y) :- v(X) \wedge a(Y))$.
So $\delta_2$ is specialized into $\delta_2' = \frac{hb(X) \wedge a(Y) : eats(X,Y) \wedge \neg(v(X) \wedge a(Y))}{eats(X,Y)}$
and at the same time, we build $\delta_3 = \frac{v(X) \wedge a(Y) : \neg eats(X,Y)}{\neg eats(X,Y)}$. This default $\delta_3$ admits
no exception.
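
The exception test behind this specialization step can be mimicked with plain sets. The sketch below is a simplification that assumes everything is a ground fact, so that "the default applies" reduces to checking its prerequisite and its extra justification conjunct directly; a real implementation would query a default-logic prover instead. The variable and function names are hypothetical, and only the part of $E^-$ relevant to $\delta_2$ is encoded.

```python
people = range(1, 56)                      # hb(1), ..., hb(55)
vegetarian = set(range(46, 51))            # v(46), ..., v(50)
flesh = {"mutton", "beef", "fish"}         # dishes satisfying a(Y)

# Negative examples involved here: not eats(1, egg) and the vegetarians' dishes.
negatives = {(x, y) for x in vegetarian for y in flesh} | {(1, "egg")}

def delta2(x, y):
    """Prerequisite of the second default: hb(X) and a(Y)."""
    return x in people and y in flesh

def delta2_specialised(x, y):
    """Same default after adding the justification not (v(X) and a(Y))."""
    return delta2(x, y) and not (x in vegetarian and y in flesh)

exceptions = {e for e in negatives if delta2(*e)}
remaining = {e for e in negatives if delta2_specialised(*e)}
```

Before specialization the five vegetarians and three flesh dishes give fifteen exceptions; after it none remain, which matches the narrative above.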

At this moment, we have finished covering all the positive examples, and
we consider the negative examples that are not yet covered by $(W', \{\delta_1, \delta_2', \delta_3\})$,
namely $\{\neg eats(51, ice\_cream), \ldots, \neg eats(55, cake)\}$. To generalize these instances,
Progol finds the formula $(diab(X) \wedge sug(Y))$, and we build a default $\delta_4$ with
this formula as prerequisite. To take into account the work that has been
done during the learning of the positive part $eats(X, Y)$, this default has a justification
which is the conjunction of the negated prerequisites of the defaults defining $eats(X, Y)$:
$\delta_4 = \frac{diab(X) \wedge sug(Y) : \neg eats(X,Y) \wedge \neg(hb(X) \wedge oa(Y)) \wedge \neg(hb(X) \wedge a(Y))}{\neg eats(X,Y)}$.
$\delta_4$ is simplified by checking that there does not exist a pair $(X, Y)$ such that
$eats(X, Y) \in E^+$ and $(diab(X) \wedge sug(Y))$ and $(hb(X) \wedge oa(Y))$ are true simultaneously.
So $(hb(X) \wedge oa(Y))$ is removed from the justification of $\delta_4$. The same
is true for $(hb(X) \wedge a(Y))$, and finally we obtain $\delta_4' = \frac{diab(X) \wedge sug(Y) : \neg eats(X,Y)}{\neg eats(X,Y)}$.
The simplification process relies on theorem proving in Horn logic and is much
less expensive than the computation of exceptions, which requires theorem proving
in default logic. One can easily check that all the positive and all the negative
examples are skeptical theorems of $(W', \{\delta_1, \delta_2', \delta_3, \delta_4'\})$.

* The following notations are used: hb stands for human_being, v for vegetarian, di for
dish and diab for diabetic; a for animal qualifies dishes that are animal flesh, and
oa for animal_origin qualifies dishes that have an animal origin; sug qualifies sugary
food.
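
The simplification check for the last default can be sketched as follows. The sketch assumes ground facts only and encodes each prerequisite as a Python predicate; the names `simplify`, `phi` and `prerequisites` are hypothetical, and the upper bounds used for the positive examples are read from a partly illegible listing, so they should be treated as assumptions (the outcome does not depend on them).

```python
people = set(range(1, 56))
diabetic = set(range(51, 56))
animal_origin = {"egg", "milk"}
flesh = {"mutton", "beef", "fish"}
sugary = {"ice_cream", "cake"}

# Positive examples eats(X, Y); the range bounds are assumptions.
positives = {(x, y) for x in range(1, 51) for y in animal_origin} - {(1, "egg")}
positives |= {(x, y) for x in range(1, 46) for y in flesh}

def phi(x, y):
    """Prerequisite of the new default: diab(X) and sug(Y)."""
    return x in diabetic and y in sugary

prerequisites = {
    "hb(X) & oa(Y)": lambda x, y: x in people and y in animal_origin,
    "hb(X) & a(Y)": lambda x, y: x in people and y in flesh,
}

def simplify(phi, prerequisites, positives):
    """Keep a prerequisite only if some positive example satisfies it with phi."""
    return [name for name, pre in prerequisites.items()
            if any(phi(x, y) and pre(x, y) for (x, y) in positives)]

kept = simplify(phi, prerequisites, positives)
```

Since no positive example concerns a sugary dish, `kept` is empty and the justification reduces to the negated consequent alone, as in the simplified default above.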

The following example illustrates that a learned default theory may have
multiple extensions.

Example 2. Let us consider that we want to learn the predicate $p$* with $W =
\{q(b1), q(b2), q(b3), q(nixon), r(t1), r(t2), r(nixon)\}$, $E^+ = \{p(b1), p(b2), p(b3)\}$
and $E^- = \{\neg p(t1), \neg p(t2)\}$.

Let us note that nixon is given neither as a positive example nor as a negative
one. So the simplification step applies to the second default, inducing $D' =
\left\{\frac{q(X) : p(X)}{p(X)}, \frac{r(X) : \neg p(X)}{\neg p(X)}\right\}$. As required by our definition, the conjunction of
all the examples is a skeptical theorem of $(W, D')$ even though this theory has two
distinct extensions $E_1 = Th(W \cup \{p(b1), p(b2), p(b3), \neg p(t1), \neg p(t2), p(nixon)\})$
and $E_2 = Th(W \cup \{p(b1), p(b2), p(b3), \neg p(t1), \neg p(t2), \neg p(nixon)\})$. Knowledge
about nixon remains undefined since it is not a training example.
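
For small ground theories such as this one, the extensions can be enumerated by brute force. The Python sketch below is a guess-and-check procedure that assumes every fact, prerequisite, justification and consequent is a single ground literal (negation written with a leading `~`); it is meant for experimentation only and is not one of the systems cited in this paper. The default set used is the pair of normal defaults with prerequisites $q$ and $r$, grounded on the individuals for which the prerequisite holds.

```python
from itertools import combinations

def neg(lit):
    """Complement of a ground literal written as 'p(a)' or '~p(a)'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def extensions(facts, defaults):
    """defaults: list of (prerequisite, justification, consequent) literals."""
    found = []
    for k in range(len(defaults) + 1):
        for chosen in combinations(defaults, k):
            candidate = set(facts) | {cons for _, _, cons in chosen}
            if any(neg(lit) in candidate for lit in candidate):
                continue                      # inconsistent guess
            # Rebuild the guess from the facts, testing justifications against it.
            derived, changed = set(facts), True
            while changed:
                changed = False
                for pre, jus, cons in defaults:
                    if pre in derived and neg(jus) not in candidate and cons not in derived:
                        derived.add(cons)
                        changed = True
            if derived == candidate and candidate not in found:
                found.append(candidate)
    return found

quakers = ["b1", "b2", "b3", "nixon"]
republicans = ["t1", "t2", "nixon"]
W = {f"q({c})" for c in quakers} | {f"r({c})" for c in republicans}
D = [(f"q({c})", f"p({c})", f"p({c})") for c in quakers]
D += [(f"r({c})", f"~p({c})", f"~p({c})") for c in republicans]

exts = extensions(W, D)
skeptical = set.intersection(*exts)
```

On this input the procedure finds two extensions, and their intersection contains every training example while leaving both `p(nixon)` and `~p(nixon)` out.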


4  Learning  with  Initial  Defaults 

In the two previous sections we considered special default theories where $W$ only
contains ground facts. This requirement was necessary to make a bridge between
default logic, where the sure knowledge can be expressed by any first-order
formula, and ILP, where the initial background knowledge is expressed by Prolog
clauses, which are not equivalent to implications. In the case of example 1, the
whole initial theory is expressed by ground facts, whereas some general Prolog
clause like $(hb(X) :- v(X))$ could have been used. Let us note that such an
oriented rule could also be written as a default in default logic.

We now consider that we want to learn a new concept from an initial default
theory and a set of examples. The initial default theory may have multiple
extensions, but this difficulty can be resolved if the learning process relies only
on the sure initial knowledge. That is why we propose a method where generalization
uses a background knowledge including only the ground facts that
are skeptical theorems of our initial theory. This new learning problem can be
stated as follows.

Definition 4. Let $E^+$ and $E^-$ be positive and negative examples of a target
predicate $p$. Let $(W_0, D_0)$ be an initial default theory containing no occurrences
of $p$ or $\neg p$.
Learning a default theory for the concept described by $E^+$ and $E^-$ consists
in building a default theory $(W', D')$ such that:

- $D' = D_0 \cup D_p$, where $D_p$ is a set of defaults whose consequents are $p$
or $\neg p$

- $W' = W_0 \cup E_p$, where $E_p$ is a set of examples that cannot be generalized

- $(\bigwedge_{e \in E^+} e) \wedge (\bigwedge_{e \in E^-} e)$ is a skeptical theorem of $(W', D')$.

* r stands for republican, q for quaker and p for pacifist.

We propose the following method to induce $W'$ and $D'$ in such a case. First,
we compute $W$, the set of ground facts that are skeptical theorems of the
initial default theory $(W_0, D_0)$. Then the algorithm to learn the default theory
$(W', D')$ is the same as the one given in subsection 2.1, except for the two following
modifications:

- DefaultLearning works on the inputs $p(X)$, $W$, $W_0$, $D_0$, $E^+$, $E^-$

- the first two initializations $W' \leftarrow W$ and $D' \leftarrow \emptyset$ are replaced by
$W' \leftarrow W_0$ and $D' \leftarrow D_0$

So the background knowledge used for generalization by $Gen$ is always $W$,
the set of skeptical theorems of $(W_0, D_0)$. But each time we have to compute a
set of exceptions, we consider the exceptions of the current theory $(W', D')$. This
current theory contains the initial default theory $(W_0, D_0)$ augmented by some
new defaults defining $p$ or $\neg p$ and possibly by some examples that cannot be
generalized. So generalization relies on sure knowledge, but the search for exceptions
takes into account the credulous theorems of $(W_0, D_0)$. This is necessary
to ensure that each example will be a skeptical theorem of the resulting default
theory $(W', D')$.

Example 3. Let us consider the initial theory $(W_0, D_0)$ with $W_0 = \{q(b1), q(b2),
q(nixon), r(t1), r(t2), r(nixon), usp(nixon), p(john)\}$ and $D_0 = \left\{\frac{q(X) : p(X)}{p(X)}, \frac{r(X) : \neg p(X)}{\neg p(X)}\right\}$,
$E^+ = \{no(b1), no(b2), no(john)\}$* and $E^- = \{\neg no(t1),
\neg no(t2), \neg no(nixon)\}$.

The initial theory $(W_0, D_0)$ has two extensions, and we consider only the set $W$
of ground facts that are skeptical theorems in order to learn a definition of $no$.
As $W = W_0 \cup \{p(b1), p(b2), \neg p(t1), \neg p(t2)\}$, our method constructs the first default
$\delta_1 = \frac{p(X) : no(X)}{no(X)}$. The negative example $\neg no(nixon)$ is an exception for
$\delta_1$ since there exists an extension where $\delta_1$ can be applied to nixon. A generalization
of this exception leads to the formula $usp(X)$. Then $\delta_1$ is specialized
into $\delta_1' = \frac{p(X) : no(X) \wedge \neg usp(X)}{no(X)}$, and at the same time we build the default $\delta_2 =
\frac{usp(X) : \neg no(X)}{\neg no(X)}$, which has no exceptions. The learning process completes the definition
of $\neg no$ by a default $\delta_3$. We can check that each example
is a skeptical theorem of $(W', D')$ with $W' = W_0$ and $D' = D_0 \cup \{\delta_1', \delta_2, \delta_3\}$.
This final default theory $(W', D')$ has two extensions because the
specification of $p(nixon)$ and $\neg p(nixon)$ remains incomplete. But each of
these extensions contains the conclusion $\neg no(nixon)$, as required by our
objective.
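
The first step of this example, restricting the background knowledge to the ground facts shared by all extensions, amounts to a set intersection. The Python sketch below assumes the two extensions of the initial theory are already available as sets of ground literals (here written out by hand, with negation as a leading `~`); computing them would normally be delegated to a default-logic prover. The helper name `skeptical_ground_facts` is hypothetical.

```python
W0 = {"q(b1)", "q(b2)", "q(nixon)", "r(t1)", "r(t2)", "r(nixon)",
      "usp(nixon)", "p(john)"}

# Ground literals of the two extensions of the initial theory.
shared = W0 | {"p(b1)", "p(b2)", "~p(t1)", "~p(t2)"}
extension_1 = shared | {"p(nixon)"}
extension_2 = shared | {"~p(nixon)"}

def skeptical_ground_facts(extensions):
    """Ground literals that belong to every extension."""
    return set.intersection(*map(set, extensions))

W = skeptical_ground_facts([extension_1, extension_2])
credulous_only = (extension_1 | extension_2) - W
```

Generalization would then use `W` as background, while the literals in `credulous_only` (the two conflicting conclusions about nixon) matter only when exceptions are searched for.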


* no stands for nuclear_opponent and usp stands for US President.

5 Related Works

The problem of learning non-monotonic theories by learning both a concept
and its negation has long been pointed out as very interesting. In [5] a
concept and its negation are effectively learned, but in the framework of definite
clauses: the negative concept is represented by a new predicate $not\_p$, and the
learning algorithm checks that no contradiction occurs between the definitions
of $p$ and $not\_p$. The framework proposed in [8] learns a concept and its exceptions
by means of general rules, and conflicts between rules are solved by additional
priority relations. This framework captures the notion of specificity of a rule, as
is done in [3] in prioritized default logic. But it is known [23] that specificity
can be handled by means of semi-normal defaults, and that is exactly what our
method does. In [7] the problem of contradiction between the definitions of $p$ and
$\neg p$ is solved by using integrity constraints in order to restrict the conclusions
derivable from overly general rules.

More recently, some works have dealt with this problem in the context of extended
logic programs [11,13,12]. Extended Logic Programs (ELP) were introduced
by Gelfond and Lifschitz [10] to extend the class of normal logic programs
by allowing explicit negation. A rule in an ELP has the form $L_0 \leftarrow
L_1, \ldots, L_m, not\ L_{m+1}, \ldots, not\ L_n$, where each $L_i$ is a literal (positive or negative).
[11,13] propose methods to learn an ELP that contains a definition
of $p$ and a definition of $\neg p$. Each definition may have exceptions that are described
by abnormality predicates, and these abnormality predicates are defined
by normal clauses. So the aim of these works is the same as ours. The
main difference is that we do not rely on abnormality predicates to specialize
overgeneral rules. For instance, the algorithm presented in [13] learns rules
for $p$ and specializes them if they have exceptions; then it computes in the
same manner a set of rules for $\neg p$. For our example 3, the following rules

$p(X) :- q(X), not\ \neg p(X).$    $ab1(X) :- usp(X).$    $\neg p(X) :- r(X), not\ p(X).$
$no(X) :- p(X), not\ ab1(X).$    $\neg no(X) :- usp(X).$

are learned. We can observe that the algorithm has dealt twice with the set of
US presidents, once when US presidents are considered as a characterization of
$ab1$ and another time when US presidents are considered as examples of the concept
$\neg no$. This illustrates that using abnormality predicates to specialize rules
hides the deep relationships that exist between the definitions of $no$ and $\neg no$, and
this leads to redundancy in the learning process and in the resulting rules. Furthermore,
the complete theory induced by [13] would in fact transform the rule
defining $no$ into the two rules $(no(X) :- p(X), not\ ab1(X), not\ \neg no(X).)$ and
$(no(X) :- p(X), undefined(\neg no(X)).)$, and similarly for the other rules concluding
$\neg no(X)$. The well-founded semantics requires these modifications in order
to deal correctly with the examples where the definitions of $no$ and $\neg no$ overlap.

The method we have presented, like those described in [11,13,12], is a method
to build a consistent theory in a non-monotonic framework. The common feature
of our work and those presented in [11,13,12] is to rely on a standard ILP
procedure to compute definitions for the positive and the negative parts of the
concept; then we use a theorem prover for default logic (a theorem prover for the
answer set semantics in [11] and for WFSX semantics in [13,12]) to compute the
potential exceptions to the definitions that have been induced. So the central
point is to study, for each semantics used, how to aggregate these definitions into
non-monotonic rules able to deal with potential contradictions.

A more recent work [25,24] presents another approach where the induction
of hypotheses is performed directly from the answer sets of the initial program.
So this work redefines the learning process according to the framework used.
In the case of a background program having multiple answer sets, the author
proposes to learn different rules for each answer set, which is very different from
our proposal in section 4. Of course, further study and also experimentation
with those formalisms on real problems are necessary to decide whether induction
must rely on credulous or skeptical knowledge.


6  Conclusion 

We have presented a framework to induce default theories from training examples.
Default logic is probably the most general framework that we can imagine
to represent at the same time a concept and its negation, and the recent tools
developed for extension calculus or query answering make it possible to consider its
application to some real domains. This paper has shown how to control the inductive
construction of a default theory to ensure that it correctly represents the knowledge
contained in the training examples. The availability of ILP systems allowed
us to check the relevance of this approach on some artificial examples. We now have
to study the generalization process further, especially on real examples. We
think that our method can be the basis of a system that helps users formalize
their knowledge in default logic.

Abstract. This work concerns the use of default knowledge in concept
learning from positive and negative examples. Two connectives are added
to a description logic, C-CLASSIC, previously defined for concept learning.
The new connectives ($\delta$ and $\epsilon$) make it possible to express the idea that some
properties of a given concept definition are default properties, and that
some properties that should belong to the concept definition actually
do not (these are excepted properties). When performing concept learning,
both hypotheses and examples are expressed in this new description
logic, but prior to learning, a saturation process using default and non-default
rules has to be applied to the examples in order to add default and
excepted properties to their definition. As in the original C-CLASSIC,
disjunctive learning is performed using a standard greedy set covering
algorithm whose generalization operator is the Least Common Subsumer
operator of C-CLASSIC$_{\delta\epsilon}$. We exemplify concept learning using default
knowledge in this framework and show that explicitly expressing default
knowledge may result in simpler concept definitions.


1  Introduction 

The general aim of concept learning is to induce hypotheses from a set of
examples of an unknown target concept. The choice of the concept (and therefore
hypothesis) and example languages is very important in this framework.
Inductive Logic Programming (ILP, [15,17]) studies learning within the framework
provided by clausal logic. However, the language is often restricted to Horn
clauses for complexity reasons. Description Logics (DLs) are other restrictions of
first-order logic¹ in which the subsumption computation and its complexity have
been deeply studied [11]. Several ILP approaches have presented a learning framework
where the learned theories or the entailment relation are non-monotonic (e.g.
[1,9]) in order to put emphasis on the problems that cannot be captured by classical
definite logic programs. We propose here to learn in a framework combining
rules, default rules and a description logic able to handle default knowledge.
In our approach, concept learning is performed in two steps. First, we use both
default rules and strict rules in order to extend the definitions of the examples
by adding default properties and excepted properties. Then, concept learning
is performed using subsumption and Least Common Subsumer algorithms that
handle examples and hypotheses including such default and excepted properties.

¹ Note that comparing the expressive power of DLs and restrictions of First Order
Logic used in ILP is still an open issue [7].

T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 173-185, 2001.
© Springer-Verlag Berlin Heidelberg 2001

Veronique Ventos et al.

The description logic used here, C-CLASSIC$_{\delta\epsilon}$ [23,22], extends C-CLASSIC
with two non-classical connectives ($\delta$, $\epsilon$) used to represent default knowledge.
C-CLASSIC is one of the most expressive tractable DLs known to date, and it
retains good computational properties for both subsumption and PAC-learnability.
The connective $\delta$ intuitively represents the common notion of default.
For instance, having $\delta Viviparous$ as a conjunct in the definition of the
concept Mammal states that mammals are generally viviparous. The connective
$\epsilon$ is used to represent a property that is not present in the description of
the concept or of the instance but that should be. Thus, for instance, being a
mammal, an ornithorynchus² should be viviparous since mammals are generally
viviparous. However, an ornithorynchus is "exceptional" w.r.t. this property (i.e.
it has the property $Viviparous^{\epsilon}$) as it is an oviparous mammal.

Default and excepted properties can be deduced by applying default rules closely
related to Reiter's normal defaults. For instance, Reiter's normal default
$\frac{Bird(x) : Fly(x)}{Fly(x)}$ is interpreted as "if $x$ is a bird and if it is consistent that $x$ can fly
then infer that $x$ can fly". Note that this default rule handles in the same way
particular birds that cannot fly, such as penguins, and non-flying animals such as cats:
nothing is deduced. In our framework, the rule corresponding to the
previous normal default is $Bird(x) \rightarrow \delta Fly(x)$, which is interpreted as "if $x$ is a
bird and if it is consistent (i.e. not incoherent) that $x$ can fly then infer that $x$
generally flies ($\delta Fly(x)$ is inferred), else infer that $x$ is exceptional w.r.t. Fly
($Fly^{\epsilon}(x)$ is inferred)". In this framework, we infer that a bird generally flies, that
penguins are exceptional w.r.t. Fly, and nothing is deduced concerning cats.
Such default rules, together with strict rules (i.e. non-default rules) that make it possible
to express incoherences (e.g. $Fly \sqcap Inapt\text{-}to\text{-}fly \rightarrow \bot$³), are used to extend the
description of the examples prior to learning.

As in the original C-CLASSIC, disjunctive learning is performed using a standard
greedy set covering algorithm whose generalization operator is the Least Common
Subsumer operator of C-CLASSICδε. The computation of the subsumption relation
of C-CLASSICδε, which is used to check whether a hypothesis covers an example,
has been proved to be correct, complete and polynomial. Furthermore,
C-CLASSICδε is PAC-learnable [23,21,25].


¹ Ornithorhynchus = duck-billed platypus.
² ⊥ is used to denote incoherences.


Explicitly  Using  Default  Knowledge  in  Concept  Learning  175 

This paper is organized as follows. Section 2 gives the necessary background
on the C-CLASSICδε description logic. Learning and the saturation process in
C-CLASSICδε are described in sections 3 and 4, together with a comparison with
learning in C-CLASSIC. Finally, in section 5 we briefly discuss related work in
the DL and ILP fields and we present future work.

2  C-CLASSICδε

Description Logics are a family of knowledge representation formalisms which
stem from KL-ONE [3]. Several systems have been built based on DLs (e.g.
CLASSIC [2], FLEX [18]) and they have been used in real-world applications
(e.g. CLASSIC in [20]). Besides, DLs facilitate the use of background knowledge
and are more expressive than attribute-value representations. The field of DLs
has received increased attention in recent years in the Machine Learning
community (e.g. [14,7,6]). These previous approaches used terminological
languages unable to define concepts with default properties, whereas allowing
for default properties in concept definitions is frequently required in
applications where few concepts can be strictly defined with necessary and
sufficient properties [12].

In DLs, a concept is defined as a set of properties satisfied by individuals that
are instances of the concept. These properties are expressed by terms that are
built from atomic concepts and roles and from a set of connectives. Concepts
are partially ordered by a subsumption relation which expresses the inclusion
relation between concepts and is usually based on a standard model-based
logical semantics. The subsumption relation in C-CLASSICδε, which is central for
the learning task, is presented in section 2.2. Knowledge is mainly separated
into two components: a terminological component (T-box) which contains the
definition of concepts and an assertional component (A-box) containing
statements about individuals. We assume here that the A-box is empty since we
represent the examples using the terminological language presented in section
2.1 (see section 3 for more details about examples). Section 2.3 presents the
Least Common Subsumer operation in C-CLASSICδε, which is the generalization
operator used during the learning process.

2.1  Terminological  Language 

The connectives of C-CLASSICδε are the connectives of C-CLASSIC [7] plus the
connectives δ and ε introduced in ALδε [8]. The terminological language of
C-CLASSICδε is defined using a set R of atomic roles, a set P of atomic
concepts, the constants ⊤ and ⊥, a set I of individuals (called
classic-individuals)³ and the following syntactic rule (C and D are concepts,
P is an atomic concept, R is an atomic role, u is a real, n is an integer and
the Ii are classic-individuals):


C, D →  ⊤                      the most general concept
     |  ⊥                      the most specific concept
     |  P                      atomic concept
     |  ONE-OF {I1 ... In}     concept in extension
     |  MIN u                  u is a real number
     |  MAX u                  u is a real number
     |  C ⊓ D                  concept conjunction
     |  ∀R:C                   value restriction
     |  R FILLS {I1 ... In}    subset of values for R
     |  R AT-LEAST n           cardinality for R (minimum)
     |  R AT-MOST n            cardinality for R (maximum)
     |  δC                     default concept
     |  C^ε                    exception to the concept C


This syntactic rule is used to define terms of C-CLASSICδε.

Example: Student ⊓ δ(publications AT-LEAST 4) ⊓ ∀age:MAX 27 ⊓ publications
FILLS {JAIR, AI} ⊓ ∀publications:(∀year:ONE-OF {97, 98, 99}) describes all
the students who generally have at least four publications, are at most 27 years
old, have at least one publication in JAIR and one in AI, and whose publications
have been published in the years 97, 98 or 99.

Defining a concept⁴ means giving a name A to a term C of the C-CLASSICδε
language using the expression A = C.

Example: Mammal = Animal ⊓ δViviparous ⊓ Vertebrate

A T-box of C-CLASSICδε is composed of concept definitions.


2.2  Subsumption in C-CLASSICδε

In DLs, concepts are organized in a taxonomy via a subsumption relation.
Concerning the strict part of C-CLASSICδε, subsumption in C-CLASSICδε is
equivalent to subsumption in C-CLASSIC. More precisely, a concept C is subsumed
by a concept D if C has (explicitly or implicitly) all properties of D. In our
framework, we must distinguish strict and default properties. Roughly speaking,
a concept C is subsumed by a default property if its definition contains either
the default property, the strict property or the excepted property. For
instance, δFly subsumes concepts having explicitly or implicitly either δFly,
Fly or Fly^ε in their definition, while concepts whose definition does not
mention anything (strict, default, exception, exception of exception ...) about
Viviparous are not subsumed by δViviparous.
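The rough rule above can be sketched in Python. This is only an illustration under strong assumptions: concepts are flat conjunctions of atomic properties, each property carries at most one mark (strict, δ or ε), nested marks such as δ(Flies^ε) are not handled, and an excepted property A^ε is assumed to be matched only by A^ε. The names Concept and subsumes are hypothetical and do not come from [22,23].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A flat conjunction of atomic properties, each carrying one mark."""
    strict: frozenset = frozenset()    # A    (strict property)
    default: frozenset = frozenset()   # δA   (default property)
    excepted: frozenset = frozenset()  # A^ε  (excepted property)


def subsumes(d: Concept, c: Concept) -> bool:
    """Return True if d subsumes c in this simplified one-level fragment."""
    mentioned = c.strict | c.default | c.excepted
    return (
        d.strict <= c.strict            # a strict property must be strict in c
        and d.default <= mentioned      # δA accepts A, δA or A^ε in c
        and d.excepted <= c.excepted    # assumption: A^ε is matched only by A^ε
    )


delta_fly = Concept(default=frozenset({"Fly"}))
flying = Concept(strict=frozenset({"Animal", "Fly"}))
exceptional = Concept(strict=frozenset({"Animal"}), excepted=frozenset({"Fly"}))
silent = Concept(strict=frozenset({"Animal"}))

print(subsumes(delta_fly, flying))       # strict Fly satisfies δFly
print(subsumes(delta_fly, exceptional))  # Fly^ε satisfies δFly
print(subsumes(delta_fly, silent))       # nothing is said about Fly
```

The third call illustrates the last sentence of the paragraph: a concept that says nothing about a property is not subsumed by the corresponding default.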

Example:

Bird = Animal ⊓ Has-Wings ⊓ δFlies (a bird generally flies)

Penguin = Animal ⊓ Has-Wings ⊓ δ(Flies^ε) (a penguin is generally exceptional
w.r.t. Flies) ⊓ δInapt-to-fly (a penguin is generally inapt to fly)

SuperPenguin = Animal ⊓ Has-Wings ⊓ (Flies^ε)^ε (a SuperPenguin is an
exception to an exception since it is an exceptional Penguin) ⊓ Inapt-to-fly^ε
(a SuperPenguin is exceptional w.r.t. Inapt-to-fly since it can fly)

With these definitions, Bird subsumes Penguin and SuperPenguin (δFlies
subsumes both δ(Flies^ε) and (Flies^ε)^ε). SuperPenguin is subsumed by
Penguin (δ(Flies^ε) subsumes (Flies^ε)^ε and δInapt-to-fly subsumes
Inapt-to-fly^ε). Note that if Bird and SuperPenguin were defined with the strict
property Fly and Penguin with the strict property Inapt-to-fly, Penguin would no
longer be subsumed by Bird and SuperPenguin would no longer be subsumed by
Penguin.

⁴ Note that cyclic concept definitions are not allowed.

More formally, let C and D be two elements of C-CLASSICδε. C ⊑ D, i.e. D
subsumes C, iff C satisfies the strict properties of D, and satisfies or is
explicitly "exceptional" w.r.t. the default properties of D.

The definition of subsumption in C-CLASSICδε is based on an "equational
system" called EQ, fully defined in [22]. EQ is a set of axioms defining the
main properties of the C-CLASSICδε connectives (e.g. the axiom A ⊓ B = B ⊓ A
expresses the commutativity of concept conjunction, the axioms A ⊓ δA = A
and A^ε ⊓ δA = A^ε express a subsumption relationship between A and δA (A is
subsumed by δA) and between A^ε and δA (A^ε is subsumed by δA), and the axiom
δδA = δA expresses the idempotence of δ).

Let =EQ denote the equality (modulo the EQ axioms) between two terms of
C-CLASSICδε. Subsumption in C-CLASSICδε is defined as follows.

Definition 1 (Subsumption) Let C and D be two elements of C-CLASSICδε.
C ⊑ D, i.e. D subsumes C, iff C ⊓ D =EQ C.

In DLs, the subsumption computation (for instance C ⊑ D) is performed in two
steps. First, C and D are expanded (i.e. their definitions are rewritten so that
they are exclusively made up of atomic concepts and roles). This expansion step
allows us to take into account the background knowledge linked to the T-box.
Then, a subsumption algorithm is applied to the expanded terms.

In [23,22], a polynomial-time, complete and correct subsumption algorithm based
on the equational system has been designed for C-CLASSIC and C-CLASSICδε.
This algorithm computes the normal form of concepts according to the equational
system; this normalization step is also used during the saturation process.
This subsumption is not a purely syntactic relation like θ-subsumption. It is a
semantic relation like logical implication or generalized subsumption [4].
Indeed, the subsumption takes into account the whole T-box, which expresses a
kind of background knowledge. In other words, the subsumption relation
corresponds to logical implication within C-CLASSICδε.


2.3  Least Common Subsumer in C-CLASSICδε

As mentioned above, learning in C-CLASSICδε relies both on the subsumption
relation and on the computation of the Least Common Subsumer (LCS)⁵ of two
concept definitions. The definition of the LCS in C-CLASSICδε is as follows:

Definition 2 (LCS in C-CLASSICδε) LCS(A,B) = C, C ∈ C-CLASSICδε, if
and only if A ⊑ C and B ⊑ C (C subsumes both A and B), and ∄ D, D ∈
C-CLASSICδε, such that A ⊑ D, B ⊑ D and D is strictly subsumed by C.

An LCS algorithm has been designed for C-CLASSICδε in [23,24]. It has been
proved that this algorithm is correct and polynomial, and that the LCS is unique.


⁵ In the framework of DLs, the notion of Least Common Subsumer has been introduced
by Borgida, Cohen and Hirsh in [5].

Example:

C = Animal ⊓ Vertebrate ⊓ With-beak ⊓ Oviparous ⊓ Has-teats ⊓ Viviparous^ε
⊓ ∀weight:MIN 20 ⊓ ∀age:MAX 15

D = Animal ⊓ Vertebrate ⊓ Has-teats ⊓ Viviparous ⊓ ∀weight:MIN 10 ⊓
∀age:MAX 10

LCS(C,D) = Animal ⊓ Vertebrate ⊓ Has-teats ⊓ δViviparous ⊓ ∀weight:MIN
10 ⊓ ∀age:MAX 15
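The example above can be reproduced with a small Python sketch. It is not the LCS algorithm of [23,24]: it only covers flat conjunctions of atomic properties with one-level marks plus MIN/MAX bounds under a value restriction, and the combination rules (keep a property strict only if it is strict on both sides, fall back to δ when both sides mention it in different forms, keep the weaker numeric bound) are assumptions read off the example. The names Concept and lcs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    strict: frozenset = frozenset()             # A
    default: frozenset = frozenset()            # δA
    excepted: frozenset = frozenset()           # A^ε
    min_of: dict = field(default_factory=dict)  # role -> u, for ∀role:MIN u
    max_of: dict = field(default_factory=dict)  # role -> u, for ∀role:MAX u


def lcs(a: Concept, b: Concept) -> Concept:
    """Least common subsumer in this simplified fragment (assumed rules)."""
    strict = a.strict & b.strict
    excepted = a.excepted & b.excepted
    # Properties mentioned on both sides in any form generalize to a default
    # unless they are already kept as strict or excepted.
    shared = (a.strict | a.default | a.excepted) & (b.strict | b.default | b.excepted)
    default = shared - strict - excepted
    # Numeric bounds: keep the weaker (more general) bound of the two.
    min_of = {r: min(a.min_of[r], b.min_of[r]) for r in a.min_of.keys() & b.min_of.keys()}
    max_of = {r: max(a.max_of[r], b.max_of[r]) for r in a.max_of.keys() & b.max_of.keys()}
    return Concept(strict, default, excepted, min_of, max_of)


c = Concept(
    strict=frozenset({"Animal", "Vertebrate", "With-beak", "Oviparous", "Has-teats"}),
    excepted=frozenset({"Viviparous"}),
    min_of={"weight": 20},
    max_of={"age": 15},
)
d = Concept(
    strict=frozenset({"Animal", "Vertebrate", "Has-teats", "Viviparous"}),
    min_of={"weight": 10},
    max_of={"age": 10},
)
result = lcs(c, d)
print(sorted(result.strict), sorted(result.default), result.min_of, result.max_of)
```

Here Viviparous is strict in D and excepted in C, so it survives in the result only as the default δViviparous, as in the example.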

3  Learning  in  C-CLASSIC 

Cohen and Hirsh [7,6] give theoretical and experimental results on the
learnability of description logics. In particular, they prove that C-CLASSIC is
PAC-learnable.

From a practical point of view, the authors propose several algorithms for
learning concepts of C-CLASSIC from positive and negative examples of these
concepts. The language of both concepts and examples is the terminological
language of the DL.

The covers relation that specifies how hypotheses relate to examples is the
subsumption relation: a hypothesis H covers an example e if and only if e ⊑ H
(i.e. H subsumes e). The aim is then to find a hypothesis H that covers all
positive examples (completeness) and none of the negative examples
(consistency). As C-CLASSIC only contains a limited kind of disjunction (the
ONE-OF connective), many target concepts of practical interest cannot be
expressed using a single term of C-CLASSIC. One way to overcome this limitation
is to consider algorithms which learn a disjunction of terms rather than a
single term, i.e. a hypothesis H such that H = H1 ∨ H2 ... ∨ Hn (however, note
that the connectives δ and ε make it possible to limit the number of disjuncts
used to represent concepts (see section 4.3)). The cover of an example is then
defined as follows:

If H = H1 ∨ H2 ... ∨ Hn and e is a concept, then H covers e if and only if ∃
Hi, e ⊑ Hi (i.e. Hi subsumes e).

The  basic  idea  behind  the  LCSLEARNDISJ  algorithm  described  in  [7]  is  to  use 
the  LCS  to  implement  a  specific-to-general  greedy  search  for  hypotheses  that 
cover  many  positive  examples  and  no  negative  examples  (this  approach  is  simi¬ 
lar  to  GOLEM  [16]  where  multi-clause  Prolog  predicates  are  learned). 

Example 1

Let E⁺ = {e1, e2, e3, e4} be a set of positive examples of the concept to learn
and E⁻ = {ce1} a set of negative examples of this concept.

e1 = Animal ⊓ Viviparous ⊓ Vertebrate ⊓ Barks.

e2 = Animal ⊓ Vertebrate ⊓ Oviparous ⊓ Has-teats.

e3 = Animal ⊓ Vertebrate ⊓ Flies ⊓ Quacks.

e4 = Animal ⊓ Vertebrate ⊓ Lives-in-Antartica ⊓ Has-Wings ⊓ Inapt-to-fly.

ce1 = Animal ⊓ Vertebrate ⊓ Lives-in-the-sea ⊓ Scales.

Results:

LCSLEARNDISJ computes the Least Common Subsumer of various subsets of
positive examples. Since all the computed LCSs (e.g. Animal ⊓ Vertebrate) cover
the negative example, no consistent generalization can be performed. As a
consequence, LCSLEARNDISJ returns the disjunction of the descriptions of the
four positive examples: (Animal ⊓ Viviparous ⊓ Vertebrate ⊓ Barks) ∨ (Animal ⊓
Vertebrate ⊓ Oviparous ⊓ Has-teats) ∨ (Animal ⊓ Vertebrate ⊓ Flies ⊓ Quacks)
∨ (Animal ⊓ Vertebrate ⊓ Lives-in-Antartica ⊓ Has-Wings ⊓ Inapt-to-fly).
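The result of Example 1 can be checked with a minimal Python sketch of a specific-to-general greedy cover. It is an assumed simplification, not the LCSLEARNDISJ procedure of [7]: concepts are plain sets of strict atomic properties, so subsumption is set inclusion and the LCS is set intersection, and the function name lcs_learn_disj and the order in which examples are tried are hypothetical.

```python
def lcs(a: frozenset, b: frozenset) -> frozenset:
    """LCS of two strict conjunctions of atoms: their common properties."""
    return a & b


def covers(hypothesis: frozenset, example: frozenset) -> bool:
    """A hypothesis covers an example if the example has all its properties."""
    return hypothesis <= example


def lcs_learn_disj(positives, negatives):
    """Greedy cover: generalize a seed with the LCS while no negative is covered."""
    remaining = list(positives)
    disjuncts = []
    while remaining:
        current = remaining.pop(0)          # seed for the next disjunct
        for example in list(remaining):
            candidate = lcs(current, example)
            if not any(covers(candidate, n) for n in negatives):
                current = candidate         # consistent generalization: keep it
                remaining.remove(example)
        disjuncts.append(current)
    return disjuncts


e1 = frozenset({"Animal", "Viviparous", "Vertebrate", "Barks"})
e2 = frozenset({"Animal", "Vertebrate", "Oviparous", "Has-teats"})
e3 = frozenset({"Animal", "Vertebrate", "Flies", "Quacks"})
e4 = frozenset({"Animal", "Vertebrate", "Lives-in-Antartica", "Has-Wings", "Inapt-to-fly"})
ce1 = frozenset({"Animal", "Vertebrate", "Lives-in-the-sea", "Scales"})

hypothesis = lcs_learn_disj([e1, e2, e3, e4], [ce1])
print(len(hypothesis))
```

Every pairwise LCS reduces to Animal ⊓ Vertebrate, which covers ce1, so each positive example ends up as its own disjunct.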

In this example, we can see that, for instance, e1 and e2 have more in common
than Animal ⊓ Vertebrate since e1 is viviparous and e2 has teats. However, the
relationship between Viviparous and Has-teats cannot be expressed in C-CLASSIC
since it is not strict knowledge (i.e. it is neither true that all animals that
have teats are viviparous nor that all viviparous animals have teats), and we
cannot add Viviparous to e2 since it is oviparous (the same problem appears
with e3, which flies, and e4, which has wings but is inapt to fly). In other
words, e2 and e4 have exceptional properties but C-CLASSIC does not allow us to
express them. We show in section 4 how the saturation process allows a better
suited concept to be learned in C-CLASSICδε without explicitly expressing the
exceptional properties of e2 and e4.

4  Learning in C-CLASSICδε

Learning in C-CLASSICδε is similar to learning in C-CLASSIC. C-CLASSICδε
has been proved PAC-learnable [23,21]. The same LCSLEARNDISJ algorithm
can be used, as polynomial subsumption and Least Common Subsumer algorithms
have been defined for C-CLASSICδε.

However, the example definitions have to be saturated prior to learning using
background knowledge. A part of this background knowledge is related to default
and excepted properties and it is used to add such properties to the positive
and negative examples. The learning problem for our framework is therefore
defined as follows:

Given: a set of T-box statements, a finite set of rules⁶ β (background
knowledge), and sets of C-CLASSICδε concepts E⁺, E⁻ (positive and negative
examples).

Build: sets of saturated examples E⁺′ and E⁻′ (E⁺′ and E⁻′ are the result of
the saturation process linked to β and applied to E⁺ and E⁻).

Find: a hypothesis H (a disjunction of C-CLASSICδε terms) such that H is
complete w.r.t. E⁺′ and consistent w.r.t. E⁻′.

4.1  Background  Knowledge 

The background knowledge β is composed of two sets of rules (C and D are
terms of C-CLASSICδε): a set Def of default rules of the form C →d D, meaning
that if a concept is subsumed by C, it is generally subsumed by D, together with
a set Inc of strict incoherence rules of the form C → ⊥, meaning that if a
concept is subsumed by C, it is incoherent. More precisely, the definitions of
Def and Inc are the following:


⁶ The syntax of these rules is defined in section 4.1.

Definition 3 (Def) Def is composed of m rules called R1, ..., Rm such that
Ri = precondition_i →d Conclusion_i, where precondition_i is a term of
C-CLASSIC⁷ and Conclusion_i a term of C-CLASSIC where the only allowed concept
conjunctions are in the value restriction of roles.

Definition 4 (Inc) Inc is composed of n rules called R1, ..., Rn such that
Ri = precondition_i → ⊥, where precondition_i is a term of C-CLASSIC.

A  rule  of  Def  or  Inc  is  applicable  if  its  precondition  subsumes  the  example. 


4.2  Saturation  Process 

One  of  the  main  operations  of  the  saturation  process  is  to  detect  a  potential 
incoherence  between  the  definition  of  an  example  and  the  conclusion  of  an  ap¬ 
plicable  default  rule  in  order  to  add  a  default  property  (no  incoherence  has  been 
detected)  or  an  excepted  property  (an  incoherence  has  been  detected)  to  the 
example. 

We  distinguish  two  kinds  of  incoherences:  incoherences  of  type  1  and  incoher¬ 
ences  of  type  2. 

An incoherence of type 1 corresponds to an incoherence linked to one or more
general axioms concerning the connectives of the language. For instance, child
AT-LEAST 2 ⊓ child AT-MOST 1 is incoherent and, more generally, for every role
R, R AT-LEAST m ⊓ R AT-MOST n is incoherent if m > n. These axioms
are expressed in the equational system of C-CLASSIC [23].

An incoherence of type 2 corresponds to an incoherence linked to a rule
belonging to Inc (e.g. Inapt-to-fly ⊓ Flies is incoherent).

It must be highlighted that in our framework incoherences are only linked to
strict knowledge. Indeed, a default property cannot be incoherent. This is the
reason why, when there is no conflict between the conclusion of the default rule
and the current description of the example, we add δConclusion rather than
Conclusion, which could later be in conflict with knowledge derived from other
rules. Besides, an excepted property never leads to an incoherence since an
exception to a concept does not correspond to a negation of this concept. For
instance, Fly^ε ⊓ Fly is not incoherent.

The fact that incoherences are linked only to strict knowledge has two
implications. First, it allows us to be sure that the addition of default and
excepted properties will not lead to further incoherences. This guarantees the
monotonicity of the extension process. Second, we can state that a term T of
C-CLASSICδε is incoherent if and only if the term T′, equivalent to T without
default and excepted properties, is incoherent. Thus, incoherences of type 1 can
be detected by translating a term of C-CLASSICδε into a term of C-CLASSIC (i.e.
by removing default and excepted properties) and by applying the normalization
procedure defined for C-CLASSIC in [23]⁸.

⁷ We could extend the process by using terms of C-CLASSICδε in the theory but our
goal is to show that we can obtain default and excepted properties from rules whose
precondition and conclusion are described using strict properties.

The sketch of the extension algorithm is the following. Let e be an example
described by a term C of C-CLASSICδε. For each rule of Def we check whether C is
subsumed by the premise of the rule. To do so, the default and excepted
properties of C are removed and the resulting C-CLASSIC term C′ is compared
with the premise of the rule by applying the subsumption algorithm designed for
C-CLASSIC. If the premise subsumes C′, we check whether the conclusion of the
default rule is incoherent with C′ (i.e. whether it leads to incoherences of
type 1 or 2). Incoherences of type 1 are detected by computing the normal form
of C′ ⊓ Conclusion using the normalization algorithm for C-CLASSIC terms. If
this normal form is equivalent to the denotation of ⊥, there is an incoherence.
Incoherences of type 2 are detected by verifying whether the premise of a rule
belonging to Inc subsumes C′ ⊓ Conclusion. If any incoherence is detected, the
conclusion is excepted and added to C (i.e. Conclusion^ε is added to C);
otherwise the conclusion of the rule is added to C as a default (i.e.
δConclusion is added to C).

The extension algorithm is as follows:

Inputs: a term C of C-CLASSICδε, a set Def = {R1, ..., Rn} of "default rules", a
set Inc = {R′1, ..., R′m} of strict incoherence rules.

Output: ENF-C, the extended normal form of C.

External procedures used:

Remove_δε(d): transforms a term d of C-CLASSICδε into a term of C-CLASSIC
by removing default and excepted properties of d (since incoherences concern only
strict properties).

Subsume(C,D): returns true if C subsumes D, C and D being two terms of
C-CLASSIC.

NF′(d): computes the normal form of a term d of C-CLASSIC.

BEGIN

C′ ← Remove_δε(C)

For all Ri of Def such that Ri = Premise_i →d Conclusion_i and Subsume(Premise_i, C′)
begin

    Add ← false  {* Add is true if an excepted property has been added *}

    * Search for incoherences of type 1

    if NF′(C′ ⊓ Conclusion_i) = ⊥  {* Conclusion_i is incoherent with the
                                      description of C′ and C *}
    then begin
        C ← C ⊓ (Conclusion_i)^ε
        Add ← true
    end

    * Search for incoherences of type 2

    if not Add then
        if there exists in Inc a premise D such that Subsume(D, C′ ⊓ Conclusion_i)
        then C ← C ⊓ (Conclusion_i)^ε
        else C ← C ⊓ δConclusion_i

end For all
END

⁸ Applying this procedure to an incoherent term normalizes the term to ⊥,
which denotes incoherences.

Example

We consider example 1 described in section 3.

Let β be a background knowledge made of two sets Inc and Def.

Def = {R1: Animal ⊓ Has-teats →d Viviparous, R2: Animal ⊓ Has-Wings →d
Flies, R3: Animal ⊓ Lives-in-the-sea ⊓ Scales →d Gills} is a set of default
rules meaning that generally animals having teats are viviparous, that generally
animals having wings fly and that generally animals with scales and living in
the sea have gills.

Inc = {R4: Viviparous ⊓ Oviparous → ⊥, R5: Inapt-to-fly ⊓ Flies → ⊥} is a set
of strict rules respectively meaning that an example cannot be both oviparous
and viviparous and that it is impossible to fly and to be inapt to fly. We now
illustrate the saturation process on example 1.

e′1 = e1

e′2 = e2 ⊓ Viviparous^ε

e′3 = e3

e′4 = e4 ⊓ Flies^ε

ce′1 = ce1 ⊓ δGills

Some explanations about the saturation of e2:

e2 verifies (i.e. is subsumed by) the precondition of R1. The addition of
Viviparous to e2 leads to an incoherence (Viviparous ⊓ Oviparous is subsumed by
⊥ from R4). The property Viviparous^ε is added to e2. Adding this property makes
it possible to highlight that a part of e2 (Oviparous) is incoherent with Animal
⊓ Has-teats →d Viviparous. This information can be useful during the learning
process described in the next section.
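The saturation of example 1 can be replayed with a short Python sketch. It rests on assumptions that go beyond the text: concepts and rule premises are flat sets of atomic properties, each default rule has a single atomic conclusion, and only incoherences of type 2 (rules of Inc) are checked, since number restrictions are not modelled. The names Concept, saturate, DEF and INC are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    strict: frozenset                  # strict properties (the term C')
    default: frozenset = frozenset()   # δ-marked properties
    excepted: frozenset = frozenset()  # ε-marked properties


def saturate(c: Concept, default_rules, incoherence_rules) -> Concept:
    """Add δConclusion or Conclusion^ε for every applicable default rule."""
    strict = c.strict                  # plays the role of Remove_δε(C)
    default, excepted = set(c.default), set(c.excepted)
    for premise, conclusion in default_rules:
        if not premise <= strict:      # the premise must subsume C'
            continue
        extended = strict | {conclusion}
        # Type 2 incoherence: some premise of Inc subsumes C' ⊓ Conclusion.
        if any(bad <= extended for bad in incoherence_rules):
            excepted.add(conclusion)
        else:
            default.add(conclusion)
    return Concept(strict, frozenset(default), frozenset(excepted))


DEF = [
    (frozenset({"Animal", "Has-teats"}), "Viviparous"),           # R1
    (frozenset({"Animal", "Has-Wings"}), "Flies"),                # R2
    (frozenset({"Animal", "Lives-in-the-sea", "Scales"}), "Gills"),  # R3
]
INC = [
    frozenset({"Viviparous", "Oviparous"}),                       # R4
    frozenset({"Inapt-to-fly", "Flies"}),                         # R5
]

e1 = Concept(frozenset({"Animal", "Viviparous", "Vertebrate", "Barks"}))
e2 = Concept(frozenset({"Animal", "Vertebrate", "Oviparous", "Has-teats"}))
e4 = Concept(frozenset({"Animal", "Vertebrate", "Lives-in-Antartica",
                        "Has-Wings", "Inapt-to-fly"}))
ce1 = Concept(frozenset({"Animal", "Vertebrate", "Lives-in-the-sea", "Scales"}))

for name, example in [("e1", e1), ("e2", e2), ("e4", e4), ("ce1", ce1)]:
    s = saturate(example, DEF, INC)
    print(name, sorted(s.default), sorted(s.excepted))
```

Because the strict part is never modified, the order in which the default rules are applied does not change the outcome in this sketch.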

4.3  Learning in C-CLASSICδε vs. C-CLASSIC

Using example 1 and the background knowledge described in the previous
section, we now show that, given the same positive and negative examples,
C-CLASSICδε makes it possible to learn disjunctive concepts represented with
fewer disjuncts than concepts learned in C-CLASSIC.

LCSLEARNDISJ is applied to the saturated examples. The first disjunct learned
by the algorithm is LCS(e′1, e′2) (i.e. Animal ⊓ δViviparous⁹ ⊓ Vertebrate).
The examples e′1 and e′2 are then removed from the learning set. The next
learned disjunct is LCS(e′3, e′4), which covers these two positive examples and
no negative example. The algorithm returns the following hypothesis: (Animal ⊓
δViviparous ⊓ Vertebrate) ∨ (Animal ⊓ δFlies ⊓ Vertebrate).

⁹ Note that this property does not belong to the LCS computed from the C-CLASSIC
definitions of e1 and e2. Yet this property is crucial since it prevents the
negative example from being subsumed (recall that ce′1 has the properties Animal ⊓
Vertebrate).

The following instance: e = Animal ⊓ Vertebrate ⊓ Lives-in-Australia ⊓ Wings
⊓ Big-feet ⊓ Inapt-to-fly is recognized by the definition learned in
C-CLASSICδε: the saturation process adds Flies^ε to e and Animal ⊓ δFlies ⊓
Vertebrate subsumes the saturated instance. Note that e is not recognized by the
definition learned in C-CLASSIC (see section 3).
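The two-disjunct hypothesis can be reproduced with a self-contained Python sketch that starts from the saturated examples as listed in section 4.2. Everything here is an assumed simplification: flat atomic properties with one-level marks, a subsumption test and an LCS that follow the rough rules of sections 2.2 and 2.3, and a greedy loop standing in for LCSLEARNDISJ. The names Concept, subsumes, lcs and lcs_learn_disj are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    strict: frozenset
    default: frozenset = frozenset()
    excepted: frozenset = frozenset()

    def mentioned(self) -> frozenset:
        return self.strict | self.default | self.excepted


def subsumes(d: Concept, c: Concept) -> bool:
    # δA accepts A, δA or A^ε; A^ε is assumed to be matched only by A^ε.
    return (d.strict <= c.strict and d.default <= c.mentioned()
            and d.excepted <= c.excepted)


def lcs(a: Concept, b: Concept) -> Concept:
    strict = a.strict & b.strict
    excepted = a.excepted & b.excepted
    default = (a.mentioned() & b.mentioned()) - strict - excepted
    return Concept(strict, default, excepted)


def lcs_learn_disj(positives, negatives):
    remaining, disjuncts = list(positives), []
    while remaining:
        current = remaining.pop(0)
        for example in list(remaining):
            candidate = lcs(current, example)
            if not any(subsumes(candidate, n) for n in negatives):
                current = candidate
                remaining.remove(example)
        disjuncts.append(current)
    return disjuncts


f = frozenset
s1 = Concept(f({"Animal", "Viviparous", "Vertebrate", "Barks"}))
s2 = Concept(f({"Animal", "Vertebrate", "Oviparous", "Has-teats"}),
             excepted=f({"Viviparous"}))
s3 = Concept(f({"Animal", "Vertebrate", "Flies", "Quacks"}))
s4 = Concept(f({"Animal", "Vertebrate", "Lives-in-Antartica", "Has-Wings",
                "Inapt-to-fly"}), excepted=f({"Flies"}))
neg = Concept(f({"Animal", "Vertebrate", "Lives-in-the-sea", "Scales"}),
              default=f({"Gills"}))

hypothesis = lcs_learn_disj([s1, s2, s3, s4], [neg])

# The saturated instance e from the text, with Flies^ε already added.
e_sat = Concept(f({"Animal", "Vertebrate", "Lives-in-Australia", "Wings",
                   "Big-feet", "Inapt-to-fly"}), excepted=f({"Flies"}))
recognized = any(subsumes(h, e_sat) for h in hypothesis)
print(len(hypothesis), recognized)
```

The saturated instance is given directly rather than derived, so the sketch does not depend on which default rule produced Flies^ε.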

5  Related  and  Further  Work 

Cohen and Hirsh suggest that learning systems based on description logics may
prove to be a useful complement to ILP systems. One issue is the investigation
of combining our framework and non-monotonic frameworks in ILP. In [9], the
authors present a framework for learning non-monotonic logic programs. Given
a background theory and a set of examples, they generate a hypothesis
within the language bias of a subclass of non-monotonic logic programs¹⁰ that
covers all the positive examples and none of the negative examples. In such
theories, in order to decide whether an atom A holds, they need to show that A
can be derived classically using some rule r for A and that ¬A cannot be
derived classically using some rule r′ which is designated higher than the rule
r by the priority relation on the program.

For instance, consider the background theory B:

bird(x) ← penguin(x)
penguin(x) ← superpenguin(x)
bird(a), bird(b), penguin(c), penguin(d), superpenguin(e), superpenguin(f)

Consider also the set of examples E = E⁺ ∪ E⁻ where E⁺ = {flies(a), flies(b),
flies(e), flies(f)} and E⁻ = {flies(c), flies(d)}.

The result of the algorithm is the hypothesis H:

R1 : flies(x) ← bird(x)

R2 : ¬flies(x) ← penguin(x)

R3 : flies(x) ← superpenguin(x)

where R1 has lower priority than R2 and R2 has lower priority than R3.
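As an illustration only, the effect of the priority relation on this example can be sketched in Python. Picking the conclusion of the highest-priority applicable rule is a simplification of the derivation condition described above, not the procedure of [9], and the names SUBCLASS, FACTS, RULES, classes_of and flies are hypothetical.

```python
# Background theory B: class hierarchy and facts about individuals.
SUBCLASS = {"superpenguin": "penguin", "penguin": "bird"}
FACTS = {"a": "bird", "b": "bird", "c": "penguin",
         "d": "penguin", "e": "superpenguin", "f": "superpenguin"}

# Hypothesis H as (priority, class in the rule body, conclusion about flies).
RULES = [
    (1, "bird", True),           # R1: flies(x) <- bird(x)
    (2, "penguin", False),       # R2: not flies(x) <- penguin(x)
    (3, "superpenguin", True),   # R3: flies(x) <- superpenguin(x)
]


def classes_of(individual: str) -> set:
    """All classes an individual belongs to, following the hierarchy upward."""
    cls = FACTS[individual]
    result = {cls}
    while cls in SUBCLASS:
        cls = SUBCLASS[cls]
        result.add(cls)
    return result


def flies(individual: str):
    """Conclusion of the highest-priority applicable rule, or None."""
    classes = classes_of(individual)
    applicable = [(prio, concl) for prio, cls, concl in RULES if cls in classes]
    if not applicable:
        return None
    return max(applicable)[1]    # priorities are unique, so max picks one rule


for x in sorted(FACTS):
    print(x, flies(x))
```

With these rules the individuals of E⁺ are classified as flying and those of E⁻ as not flying.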

In such a non-monotonic framework, the goal is to learn strict predicates (e.g.
Fly, of which penguin is a negative example) by generating default rules. This
approach is not suited to learning strict concepts having default properties in
their definition (e.g. Bird, of which penguin is a positive example despite the
fact that it is exceptional w.r.t. the Fly property). Further work could consist
in using default rules learned in this non-monotonic framework (e.g. R1, R2,
R3¹¹) in order to improve learning in C-CLASSICδε (for instance, positive and
negative examples of the concept Bird could be saturated with δ(Flies^ε) or
(Flies^ε)^ε using R1, R2 and R3). Note that the problem of learning with a
non-monotonic background knowledge is one of the possible directions for further
research listed in [9]. Abduction [10,13] is also the basis for non-monotonic
learning frameworks, providing a uniform technique to handle negation as
failure, incomplete predicates and integrity constraints. We also need to
compare our work with these approaches.

¹⁰ Theories where their set of contradictory rules can be separated into classes where
the rules in each class are totally ordered by the priority relation of the theory.

¹¹ The presence of ¬ is not a problem since it is straightforward to add negation
on atomic concepts in C-CLASSICδε (and the axiom A ⊓ ¬A = ⊥ in the equational
system) in order to take such incoherences into account.

In conclusion, in this paper we have defined a problem setting for concept
learning in a framework combining a description logic that allows concepts to
be defined with default and excepted properties, and a background knowledge
represented by rules and default rules. We proposed a prior saturation of the
examples using the background knowledge and we showed that learning from
extended examples can lead to the construction of a more satisfactory learned
concept. More precisely, the learned concepts are smaller in size (they have
fewer disjuncts) and they are more general, covering more examples which can be
identified as belonging to the target concept. The presence of defaults is a way
to improve the expressive power of the DL (few concepts can be defined with
necessary and sufficient properties using only strict knowledge) and therefore
to improve the learning process. Finally, note that the application of default
rules is difficult since it can lead to ambiguities. For instance, in [19] the
authors integrated defaults in DLs using incident rules of the form c1 →d c2
meaning "whenever an object is an instance of c1 it is also an instance of c2
unless this is in conflict with some other piece of knowledge". This approach
requires handling multiple extensions by defining preference criteria (e.g. the
preferred models contain the most specific knowledge or the most applied
defaults). In our framework, the connectives δ and ε allow us to avoid this
problem.

Abstract. In a combinatorial auction problem bidders are allowed to
bid  on  a  bundle  of  items.  The  auctioneer  has  to  select  a  subset  of  the 
bids  so  as  to  maximize  the  price  it  gets,  and  of  course  making  sure  that 
it  does  not  accept  multiple  bids  that  have  the  same  item  as  each  item 
can  be  sold  only  once.  In  this  paper  we  show  how  the  combinatorial 
auction  problem  and  many  of  its  extensions  can  be  expressed  in  logic 
programming  based  systems  such  as  Smodels  and  dlv.  We  propose  this 
as  an  alternative  to  the  standard  syntax  specific  specialized  implemen¬ 
tations  that  are  much  harder  to  modify  and  extend  when  faced  with 
generalizations  and  additional  constraints. 


1  Introduction  and  Motivation 

In  a  simple  auction  several  bidders  bid  for  an  item  and  the  auctioneer  selects  the 
highest  bid.  Often  bidders  need  a  bundle  of  items,  where  the  worth  of  the  whole 
bundle  -  to  the  bidder  -  may  be  more  than  the  sum  of  the  individual  worth  of 
each  item  in  the  bundle.  For  example,  let  A  and  B  be  two  adjacent  real  estate 
plots.  A  single  developer  can  often  make  more  money  developing  both  plots 
together  than  two  different  developers  developing  A  and  B  separately  without 
co-operating  with  each  other.  This  happens  if  say  both  A  and  B  are  needed  to 
create  a  lucrative  golf  course  while  A  and  B  separately  can  only  be  used  for  less 
profitable purposes. The opposite may be true in some cases too. The cases that are often mentioned with regard to both are airport landing slots [9], bandwidth auctions, real estate auctions, and transportation exchanges [11].

In such cases participating in parallel or sequential auctions for each item in a bundle desired by a bidder is risky, as the bidder may not win all items in the bundle. Moreover, it would be difficult for the bidder to price each item in the bundle individually. One way to avoid such problems is to have combinatorial auctions where bidders are allowed to bid on bundles. Although this is good for the bidder, the seller's problem of deciding which bids to accept becomes harder, as different bidders can make up their own bundles on which they bid.

Recently, there has been a lot of interest in this problem because of its applicability in Internet-based auctions, B2B exchanges, and multi-agent systems [16,7]. There have been several papers [12,13,2,4,6,15,10] that analyze this


T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  186-199,  2001. 
©  Springer- Verlag  Berlin  Heidelberg  2001 


Declarative  Specification  and  Solution  of  Combinatorial  Auctions  187 

problem and present algorithms and techniques to solve it and a few of its generalizations. One starting point that guides research on this is the result from [10], which shows the problem of finding the optimal set of bids (that maximize the seller's take) to be NP-complete.

So far there are three different approaches for solving this problem: complete algorithms [12,2,10] that find an optimal solution in the general case, incomplete methods [4] that find high quality solutions quickly, and identification of tractable special cases and algorithms for those cases [15,13]. The other possible approach of finding approximation algorithms is blocked by the result from [12] that shows that no polynomial algorithm can guarantee a solution that is close to optimal.

In this paper we follow the first approach of obtaining optimal solutions in the general case. Our methodology is different from the earlier approaches [12,2,10] in that we would like to represent the problem in a declarative knowledge representation language such that optimal ‘models’ of the representation correspond to optimal solutions. This is similar to the methodology of satisfiability based planning [5] where the planning problem is represented as a propositional theory, and each model of this theory encodes a plan. The main motivation behind our approach of using a declarative knowledge representation language is that we would like the process of adding additional constraints, or making a generalization, to be easier. This differs from the other approaches [12,13,2,6] where major changes were needed to move from single unit combinatorial auctions to multi-unit combinatorial auctions. Also, as mentioned in [13], additional generalizations necessitate changes in the code, which require knowledge of the structure of the code and hence can only be done by people adequately familiar with the original code. In contrast we will show that when using a declarative knowledge representation language, an additional generalization or the addition of new constraints often amounts to adding a few extra rules, without needing detailed knowledge of the original code or its structure.

The declarative language that we will use throughout this paper is Smodels [8,14]¹, an extension of logic programming with answer set semantics [3]. It has new constructs such as cardinality and weight constraints, and optimization statements. It is preferable over propositional logic as it is more expressive in terms of being able to express transitive closure, causality, and aggregation. Moreover, it is a non-monotonic language and hence more suitable for knowledge representation, and finally it includes optimization statements. (A more detailed argument about the advantages of logic programming with answer set semantics over other logics is given in the draft of a book by the first author available at http://www.public.asu.edu/~cbaral/.) Smodels is preferable over ILP (integer linear programming) because it can represent logical specifications more easily. Although ILP can accommodate propositional logic, it has not been shown how it can accommodate non-monotonic features of a logic program.

¹ Strictly speaking, Smodels is a system that started off as implementing the answer set semantics of logic programs and now has several new constructs. By the Smodels language we refer to the extension of logic programs that is used by the Smodels system.

² We would like to mention that some of the encodings in this paper can also be expressed in the language of the dlv system [1]. Due to lack of space we only focus on the Smodels system.

Our  goal  in  this  paper  is  to  show  how  single  unit  and  multi  unit  combinatorial 
auction  problems  can  be  specified  and  declaratively  solved  using  Smodels,  and 
how  it  is  easy  to  add  additional  constraints  and  further  generalizations  to  the 
original  problem  using  Smodels.  We  hope  this  representation  will  serve  as  a 
benchmark  to  the  logic  programming,  knowledge  representation  and  declarative 
problem  solving  communities  to  develop  more  efficient  implementations  of  the 
Smodels  language  such  that  the  timing  of  obtaining  solutions  of  combinatorial 
auction  problems  specified  in  Smodels  is  comparable  to  that  of  the  specialized 
algorithms/programs  in  [12,13,2,6]. 

2 Background: The Smodels Language

A logic program is a collection of rules of the form

$a_0 \leftarrow a_1, \ldots, a_m, \mathbf{not}\ a_{m+1}, \ldots, \mathbf{not}\ a_n$  (1)

where the $a_i$'s are atoms. For an atom $a$, “not $a$” is referred to as a naf-literal. Intuitively, the above rule means that if $a_1, \ldots, a_m$ are true and $a_{m+1}, \ldots, a_n$ can be assumed to be false then $a_0$ must be true. Logic programs whose rules do not have not in the body, referred to as definite programs, have unique answer sets, which are the least models of the theory obtained by treating rules of the form $a_0 \leftarrow a_1, \ldots, a_m$ as the classical formula $a_1 \wedge \ldots \wedge a_m \supset a_0$. Given a logic program $P$ and a set of atoms $S$, the Gelfond-Lifschitz transformation $P^S$ is defined as the set of rules obtained from $P$ by removing all rules whose body contains not $b$ such that $b \in S$, and then removing the naf-literals from the rest of the rules. A set $S$ of atoms is said to be an answer set of a logic program $P$ if $S$ is the answer set of the definite program $P^S$.
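The definition above can be illustrated with a small Python sketch. It is only a brute-force reading of the definition, not how Smodels computes answer sets; the rule representation (a triple of head, positive body, negative body) and the two-rule example program are assumptions made for this illustration.

```python
from itertools import chain, combinations

# A rule is (head, positive_body, negative_body); atoms are plain strings.
# Hypothetical example program:   p <- not q.    q <- not p.
PROGRAM = [("p", (), ("q",)), ("q", (), ("p",))]


def reduct(program, candidate):
    """Gelfond-Lifschitz transformation of `program` w.r.t. the set `candidate`.

    Rules with some `not b` where b is in `candidate` are dropped; the
    remaining rules lose their naf-literals, leaving a definite program.
    """
    return [(head, pos) for head, pos, neg in program
            if not any(b in candidate for b in neg)]


def least_model(definite_program):
    """Least model of a definite program, computed by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_program:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def answer_sets(program):
    """All sets S of atoms such that S is the least model of the reduct by S."""
    atoms = sorted({a for h, pos, neg in program for a in (h, *pos, *neg)})
    candidates = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(c) for c in candidates
            if least_model(reduct(program, set(c))) == set(c)]
```

For the example program, `answer_sets(PROGRAM)` yields the two sets {p} and {q}: each is the least model of its own reduct, while the empty set and {p, q} are not.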

In the Smodels language, each of the $a_0, \ldots, a_m$ can be replaced by cardinality expressions and weight expressions. An example of a cardinality expression is:

3 {sold(X) : item(X)} 6

which is true in an answer set if the number of items that are sold is between 3 and 6 (inclusive). We can encode the value each item is sold for by a weight declaration of the form:

weight sold(a) = 8.

which would mean that item a was sold for $8. Now the weight expression

23 [sold(X) : item(X)] 36

will be true in an answer set if the total value of those items that are sold is between 23 and 36 (inclusive).

Optimization statements are syntactically similar to weight and cardinality expressions, except that the lower and upper bounds are replaced by the label maximize or minimize on the left-hand side. For example, if we want to obtain the answer sets where the number of items that are sold is maximum then we need to write the following:

maximize {sold(X) : item(X)}.

Smodels allows multiple optimization statements and treats them as a compound optimization through a lexicographic ordering among the optimization statements. A more formal characterization of the Smodels language is given in [14].
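To make the truth conditions of cardinality and weight expressions concrete, here is a minimal Python sketch that evaluates both against a given answer set. The function names and the weights of items b, c and d are assumptions made for illustration; only the weight 8 for sold(a) comes from the text.

```python
def cardinality_holds(lower, atoms, upper, answer_set):
    """True if the number of listed atoms in the answer set lies in [lower, upper]."""
    count = sum(1 for atom in atoms if atom in answer_set)
    return lower <= count <= upper


def weight_holds(lower, weights, upper, answer_set):
    """True if the summed weight of listed atoms in the answer set lies in [lower, upper]."""
    total = sum(w for atom, w in weights.items() if atom in answer_set)
    return lower <= total <= upper


# Hypothetical weights; only sold(a) = 8 is taken from the text.
WEIGHTS = {"sold(a)": 8, "sold(b)": 10, "sold(c)": 7, "sold(d)": 12}
ANSWER_SET = {"sold(a)", "sold(b)", "sold(c)"}

# 3 {sold(X) : item(X)} 6   holds, since three items are sold
card_ok = cardinality_holds(3, WEIGHTS.keys(), 6, ANSWER_SET)
# 23 [sold(X) : item(X)] 36  holds, since the sold items total 8 + 10 + 7 = 25
weight_ok = weight_holds(23, WEIGHTS, 36, ANSWER_SET)
```

An optimization statement would, under the same reading, compare answer sets by the count or the summed weight rather than test it against fixed bounds.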

3  Single  Unit  Combinatorial  Auction 

We explain the single unit combinatorial auction problem through an example. The auctioneer has the set of items {1,2,3,4}, and the buyers submit bids {a, b, c, d, e} where a consists of ({1,2,3}, 24), meaning that the bid a is for the bundle {1,2,3} and its price is $24. Similarly b consists of ({2,3}, 9), c consists of ({3,4}, 8), d consists of ({2,3,4}, 25), and e consists of ({1,4}, 15). The winner determination problem is to accept a subset of the bids with the stipulation that no two bids containing the same item can be accepted, so as to maximize the total price fetched. We now present an Smodels encoding (which is both a specification and a program) of this example.

3.1  Specifying  the  Domain 

1. We specify the bid names and their values as follows:

bid(a). weight sel(a) = 24. bid(b). weight sel(b) = 9. bid(c). weight sel(c) = 8.
bid(d). weight sel(d) = 25. bid(e). weight sel(e) = 15.

2. We specify the items as follows: item(1..4).

3. We specify the composition of each bid, in terms of what items it consists of, as follows.

in(1,a). in(2,a). in(3,a). in(2,b). in(3,b). in(3,c). in(4,c).
in(2,d). in(3,d). in(4,d). in(1,e). in(4,e).

3.2  The  General  Rules 

We  have  the  following  general  rules  which  together  with  the  domain  specific  rules 
from  the  previous  subsection,  when  run  using  Smodels  will  give  us  the  winning 
bids. 

1. The following two rules label each bid as either selected or not selected.

sel(X) ← bid(X), not not_sel(X).
not_sel(X) ← bid(X), not sel(X).

They can be replaced by the following single Smodels rule:

{sel(X)} ← bid(X).

2. The following enforces the constraint that two different bids sharing an item cannot both be selected.

← sel(X), sel(Y), X ≠ Y, in(I, X), in(I, Y).

The above Smodels rule does not follow the syntax of rules in Section 2. Such rules of the form

$\leftarrow a_1, \ldots, a_m, \mathbf{not}\ a_{m+1}, \ldots, \mathbf{not}\ a_n.$

with empty head mean that there cannot be answer sets that make the body true. Such a rule can be thought of as the following rule, where $f$ is a new atom, which satisfies the syntax of (1) in Section 2.

$f \leftarrow \mathbf{not}\ f, a_1, \ldots, a_m, \mathbf{not}\ a_{m+1}, \ldots, \mathbf{not}\ a_n.$

3. The following optimization statement specifies that we must select bids such that their total price is maximized.

maximize [sel(X) : bid(X)].

When we run the above program in the Smodels system using the command

lparse aucl.sm | smodels 0

the system first outputs the answer set {a}, and then outputs the optimal answer set {d}, indicating that the latter has a higher total price.
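The optimum reported for this example can be cross-checked independently of Smodels. The following Python sketch simply enumerates all subsets of the five bids; it is an exhaustive check suitable only for tiny instances, and the function names are assumptions made for this illustration.

```python
from itertools import combinations

# Bids of the running example: name -> (bundle of items, price).
BIDS = {
    "a": ({1, 2, 3}, 24),
    "b": ({2, 3}, 9),
    "c": ({3, 4}, 8),
    "d": ({2, 3, 4}, 25),
    "e": ({1, 4}, 15),
}


def compatible(selection):
    """No two distinct selected bids may share an item."""
    return all(BIDS[x][0].isdisjoint(BIDS[y][0])
               for x, y in combinations(selection, 2))


def total_price(selection):
    """Sum of the prices of the selected bids."""
    return sum(BIDS[x][1] for x in selection)


def winners():
    """Feasible selection with the highest total price, found by enumeration."""
    names = sorted(BIDS)
    feasible = [s for k in range(len(names) + 1)
                for s in combinations(names, k) if compatible(s)]
    return max(feasible, key=total_price)
```

On this instance `winners()` selects bid d alone, for a total price of 25; the runners-up, bid a alone and the pair b and e, both fetch 24.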


3.3  Formal  Characterization 

In a combinatorial auction (single unit case), the auctioneer has $m$ items $M = \{1, \ldots, m\}$ and the buyers submit $n$ bids $B = \{B_1, \ldots, B_n\}$, where each bid is a tuple $B_i = \langle S_i, p_i \rangle$, with $S_i \subseteq M$, and $p_i$ is a price. The winner determination problem [13] is an assignment of bids as accepted ($x_i = 1$) or not ($x_i = 0$), for $1 \le i \le n$, that satisfies the constraint

$\sum_{i \mid j \in S_i} x_i \le 1$ for $1 \le j \le m$; and maximizes $\sum_{i=1}^{n} p_i \times x_i$.

The above characterization can be related to the Smodels encoding as follows:

Theorem 1. For a single unit combinatorial auction problem with integer prices, each solution to the winner determination problem corresponds to an optimal answer set of the encoding described in 3.1-3.2 and vice-versa.


3.4  Encoding  in  dlv 

The dlv system [1] is also an implementation of an extension of logic programming with additional constructs. It allows disjunctions in the head of rules and captures the second level of the polynomial hierarchy. Among its additional constructs are weak constraints, which are of the form:

:~ $p_1, \ldots, p_m, \mathbf{not}\ q_1, \ldots, \mathbf{not}\ q_n$. [weight : level]

Given a program with weak constraints, its best answer sets are obtained by first obtaining the answer sets without considering the weak constraints, ordering each of them based on the weight and priority level of the set of weak constraints they violate, and then selecting the ones that violate the minimum. In the presence of both weight and priority level information, the minimization is done with respect to the weight of the constraints of the highest priority, then the next highest priority, and so on.

The encoding in Sections 3.1 and 3.2 can alternatively be expressed in dlv³ as follows:

1. We have the bid atoms from part 1 of Section 3.1, and in atoms from part 3 of Section 3.1.

2. We can either have the rules in part 1 of Section 3.2 or use disjunction and have the rule: sel(X) ∨ not_sel(X) ← bid(X).

3. We have the constraint in part 2 of Section 3.2.

4. Finally, instead of the optimization statement in part 3 of Section 3.2, we have the following weak constraints.

:~ not sel(a). [24 : 1]
:~ not sel(b). [9 : 1]
:~ not sel(c). [8 : 1]
:~ not sel(d). [25 : 1]
:~ not sel(e). [15 : 1]
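The weak constraints above penalize every bid that is not selected by its price, so minimizing the total penalty should pick the same selection as maximizing the total price. The Python sketch below checks this on the running example by enumeration; it models only the single priority level used here, and the function names are assumptions made for this illustration.

```python
from itertools import combinations

# Bids of the running example: name -> (bundle of items, price).
BIDS = {
    "a": ({1, 2, 3}, 24),
    "b": ({2, 3}, 9),
    "c": ({3, 4}, 8),
    "d": ({2, 3, 4}, 25),
    "e": ({1, 4}, 15),
}


def compatible(selection):
    """Hard constraint: no two distinct selected bids share an item."""
    return all(BIDS[x][0].isdisjoint(BIDS[y][0])
               for x, y in combinations(selection, 2))


def violation_cost(selection):
    """Total weight of the violated weak constraints, one per unselected bid."""
    return sum(price for name, (_, price) in BIDS.items()
               if name not in selection)


def best_by_weak_constraints():
    """Feasible selection with the smallest total violation weight."""
    names = sorted(BIDS)
    feasible = [s for k in range(len(names) + 1)
                for s in combinations(names, k) if compatible(s)]
    return min(feasible, key=violation_cost)
```

Since the violation cost of a selection equals the sum of all prices (81) minus the price of the selected bids, the minimum cost 56 is reached exactly at the price-maximal selection {d}.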

4  Combinatorial  Auction  with  CNF  Bids 

In this section we show how the single unit combinatorial auction specification can be generalized such that a bidder can specify some options between his bids. For example a bidder may want to specify that only one of his bids g and h be accepted, but not both. This can be generalized further such that a bidder can specify a CNF⁴ bid [4], which is a conjunction of (ex-or) disjunctions of items such that one item from each of the conjuncts is awarded to the bidder. (An alternative way to achieve this is by opening up the CNF to several bids and adding a dummy [2] item to each of the bids so that exactly one of them is selected.) As before we show our encoding with respect to an example.

1. We will have the domain specification as in part 1 and 2 of Section 3.1 and the general rules in part 1 and 3 of Section 3.2.

2. Recall that a CNF bid is not a bundle of items; rather it could be of the following form:

$a = (g_1 \oplus h_1) \wedge (g_2 \oplus h_2) \wedge (g_3 \oplus h_3)$

which means that the bid a can be satisfied by granting one of the items g1 and h1, one of the items g2 and h2, and one of the items g3 and h3. We can represent this in Smodels as follows:

conj(c1, a). disj(g1, c1). disj(h1, c1).
conj(c2, a). disj(g2, c2). disj(h2, c2).
conj(c3, a). disj(g3, c3). disj(h3, c3).

This will replace the representation in part 3 of Section 3.1.

³ In the future we plan to compare the timings using the dlv system with the timings using the Smodels system.

⁴ Although the use of CNF is somewhat misleading, we use it to be consistent with the original terminology in [4].

3. Now it is not enough to just label bids as selected or un-selected. After labeling a bid as selected we must identify which items are granted as part of that selected bid. We have the following rules to encode that.

other_granted(X, C, G) ← granted(X, C, G'), G' ≠ G.
granted(X, C, G) ← sel(X), conj(C, X), disj(G, C), not other_granted(X, C, G).

Intuitively, granted(X, C, G) means that as part of the selection of bid X, to satisfy the conjunct C, item G is granted; and other_granted(X, C, G) means that some item other than G has been granted. The above two rules ensure that for any selected bid X and its conjunct C, exactly one item in that conjunct is granted in each answer set.

4. Because of the difference between a CNF bid and a simple bid consisting of a bundle, part 2 of Section 3.2 needs to be replaced by the following rule, so as to enforce that we should not select two bids and grant the same item with respect to both.

← bid(X), bid(Y), granted(X, C, G), granted(Y, C', G), X ≠ Y.
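The effect of the granted rules can be pictured as choosing exactly one item per conjunct of a selected CNF bid. The Python sketch below enumerates these choices for the bid a of the example and tests whether two grants clash on an item; the dictionary layout and the function names are assumptions made for this illustration, and the second grant used in the usage note is hypothetical.

```python
from itertools import product

# CNF bid a from the example: conjunct name -> the items it offers as alternatives.
CNF_BID_A = {"c1": ("g1", "h1"), "c2": ("g2", "h2"), "c3": ("g3", "h3")}


def possible_grants(cnf_bid):
    """All ways to grant exactly one item per conjunct of a selected CNF bid."""
    conjuncts = sorted(cnf_bid)
    return [dict(zip(conjuncts, choice))
            for choice in product(*(cnf_bid[c] for c in conjuncts))]


def clash(grant_x, grant_y):
    """True if two different bids would be granted the same item."""
    return bool(set(grant_x.values()) & set(grant_y.values()))
```

For bid a there are 2 × 2 × 2 = 8 possible grants. A hypothetical second bid whose only conjunct is granted item g1 would clash with every grant of a that also uses g1, which is the situation the constraint in part 4 rules out.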

5  Multi-unit  Combinatorial  Auction 

Multi-unit  combinatorial  auction  is  a  generalization  of  the  single  unit  case,  where 
the  auctioneer  may  have  multiple  identical  copies  of  each  item  and  the  bids  may 
specify multiple units of each item. The goal here is the same as before: to maximize
the  total  price  that  is  fetched;  but  the  condition  is  that  the  bids  should  be 
selected  such  that  for  any  item  the  total  number  that  is  asked  by  the  selected 
bids  should  not  be  more  than  the  number  that  is  originally  available  for  that 
item.  As  before,  we  describe  our  Smodels  encoding  with  respect  to  an  example: 
first  the  specification  for  a  particular  domain,  and  then  a  set  of  general  rules. 


5.1  Specifying  the  Domain 

1. The bid names and their values are specified as in part 1 of 3.1.
bid(a). weight sel(a) = 23.
bid(b). weight sel(b) = 9.
bid(c). weight sel(c) = 8.
bid(d). weight sel(d) = 25.
bid(e). weight sel(e) = 15.

2.  We  specify  the  items  and  their  initial  quantities  as  follows: 
item(i).  item(j).  item(k).  item(l). 

limit(i,8).  limit(j,10).  limit(k,6).  limit(l,12). 

3.  We  specify  the  composition  of  each  bid  as  follows: 
in(i,a,6).  in(j,a,4).  in(k,a,4). 




Intuitively, the above means that bid ‘a’ is for 6 units of item ‘i’, 4 units of item ‘j’, and 4 units of item ‘k’.

in(j,b,6).  in(k,b,4). 

in(k,c,2).  in(l,c,10). 

in(j,d,4).  in(k,d,2).  in(l,d,4). 

in(i,e,6).  in(l,e,6). 


5.2  The  General  Rules 

We  have  the  following  general  rules  which  together  with  the  domain  specific  rules 
of  the  previous  subsection,  when  run  using  Smodels  will  give  us  the  winning  bids. 

1. The following two rules label each bid as either selected or not selected.

sel(X) ← bid(X), not not_sel(X).
not_sel(X) ← bid(X), not sel(X).

2. The following rule defines sel_in(I, X, Z), which intuitively means that bid X is selected, and Z units of item I are in bid X.

sel_in(I, X, Z) ← item(I), bid(X), sel(X), in(I, X, Z).

3. The following weight declaration assigns the weight Z to the atom sel_in(X, Y, Z).

weight sel_in(X, Y, Z) = Z.

The above weight assignment is used in the next step to compute the total quantity of each item in the selected bids.

4. The following rule enforces the constraint that for each item, the total quantity that is to be encumbered towards the selected bids must be less than or equal to the initial available quantity of that item.

← Y' [sel_in(I, X, Z) : bid(X) : num(Z)], item(I), limit(I, Y), Y' = Y + 1.

5. As before we have the following optimization statement.

maximize [sel(X) : bid(X)].

When the above program is run through Smodels using the command

lparse file.sm | smodels 0

the system first outputs the answer set {sel(d), sel(a)}, and then outputs another answer set {sel(e), sel(d), sel(b)} and mentions that the latter one is optimal.

Thus  the  Smodels  system  starts  off  with  a  sub-optimal  solution,  and  keeps 
giving  better  and  better  solutions  until  an  optimal  solution  is  found.  We  refer  to 
this  as  exhibiting  a  weak  anytime  behavior  as  after  the  first  solution  is  found,  a 
user  may  interrupt  the  system  at  any  time  and  get  a  sub-optimal  solution  which 
improves  with  time.  Since  there  is  no  guarantee  that  the  first  solution  will  be 
found  within  a  certain  time  bound  we  use  the  qualifier  ‘weak’  with  the  adjective 
‘anytime’. 
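The two answer sets reported above can be checked by enumeration, and the same loop gives a rough picture of the improving sequence of solutions. The Python sketch below is an exhaustive check for this small instance only; its enumeration order is an assumption and will generally differ from the order in which Smodels finds solutions.

```python
from itertools import combinations

# Domain of Section 5.1: available units, bid prices, and units requested per bid.
LIMIT = {"i": 8, "j": 10, "k": 6, "l": 12}
PRICE = {"a": 23, "b": 9, "c": 8, "d": 25, "e": 15}
UNITS = {
    "a": {"i": 6, "j": 4, "k": 4},
    "b": {"j": 6, "k": 4},
    "c": {"k": 2, "l": 10},
    "d": {"j": 4, "k": 2, "l": 4},
    "e": {"i": 6, "l": 6},
}


def feasible(selection):
    """Requested units of every item must not exceed the available units."""
    return all(sum(UNITS[b].get(item, 0) for b in selection) <= cap
               for item, cap in LIMIT.items())


def value(selection):
    """Total price of the selected bids."""
    return sum(PRICE[b] for b in selection)


def improving_solutions():
    """Yield feasible selections, each strictly better than the one before."""
    best = -1
    names = sorted(PRICE)
    for k in range(len(names) + 1):
        for selection in combinations(names, k):
            if feasible(selection) and value(selection) > best:
                best = value(selection)
                yield selection, best
```

Interrupting the generator after any yielded item gives a feasible but possibly sub-optimal selection, and the last item it yields is the selection {b, d, e} with total price 49; the selection {a, d} is feasible with total price 48.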


5.3  Formal  Characterization 

In the multi-unit case, the auctioneer has $u_j$ units of each item $j$, $1 \le j \le m$, and each bid $B_i$ is of the form $\langle (\lambda_i^1, \ldots, \lambda_i^m), p_i \rangle$, where $\lambda_i^j$ denotes the number of units of item $j$ that are part of the bid $B_i$. In this case, the winner determination problem [13] is an assignment of bids as accepted ($x_i = 1$) or not ($x_i = 0$), for $1 \le i \le n$, that satisfies the constraint

$\sum_{i=1}^{n} \lambda_i^j \times x_i \le u_j$ for $1 \le j \le m$; and maximizes $\sum_{i=1}^{n} p_i \times x_i$.

Theorem 2. For a multi-unit combinatorial auction problem with integer prices, each solution to the winner determination problem corresponds to an answer set of the encoding described in 5.1-5.2 and vice-versa.

6  Combinatorial  Exchanges 

A combinatorial exchange is a further generalization, where we have buyers and sellers. The buyers bid as before, while the sellers offer their items for a price. The job of the exchange is to accept a subset of the bids of the buyers and sellers such that it maximizes the surplus (the amount it obtains from the buyers minus the amount it has to pay to the sellers), subject to the condition that for each item, the total number it obtains from the selected seller bids is at least what it has to give for the selected buyer bids. Note that the maximization condition guarantees that the exchange does not lose money outright. This is because by not accepting any bids the surplus will be zero. So when the exchange accepts some bids its surplus would have to be positive. We now describe our Smodels encoding for multi-unit combinatorial exchanges through a slight modification of the example in Section 5. The modification is that instead of specifying the initial quantity of each item, we create a seller f, who offers those quantities for a price.

1. We have part 1 and part 3 of Section 5.1 and only the items listing of part 2 of Section 5.1. We do not have the description of the initial quantity for the items. Instead the bid for the seller f is specified as follows:

bid(f). weight sel(f) = -50.

in(i,f,-8). in(j,f,-10). in(k,f,-6). in(l,f,-12).

A seller's bid is distinguished from a buyer's bid by having a negative price for the whole bid (meaning the seller wants money for those items, instead of being ready to give a certain amount of money), and similarly the atom in(i, f, -8) means that the seller f has 8 units of item i to sell, while in(i, a, 6) would mean that the buyer a wants to buy 6 units of item i.

2. We have part 1, 2, 3, and 5 of Section 5.2 and we replace part 4 by the following rule.

← Y [sel_in(I, X, Z) : bid(X) : num(Z)] Y, item(I), Y > 0.

The above rule enforces the constraint that for each item I, the total number encumbered with respect to the selected buyer bids should be less than or equal to the sum that is available from the selected seller bids. Note that the use of the same variable Y as the upper and lower bound of the weight constraint serves the purpose of computing the aggregate.⁵

3. Although the following rule is normally not necessary, as it is taken care of by the maximize statement, by having it we can exploit the weak anytime behavior. It also eliminates selections where the exchange may lose money earlier in the process.

← Y [sel(X) : bid(X)] Y, Y < 0.


When  we  run  the  above  program  through  Smodels  it  tells  us  to  not  select  any 
bids.  This  is  expected  because  the  maximum  amount  that  can  be  obtained  from 
the  buyers  is  $49  by  selecting  b,  d  and  e;  but  to  satisfy  that  we  have  to  select  f, 
which  costs  $50,  resulting  in  a  net  loss  to  the  exchange.  On  the  other  hand  if 
we  change  our  example,  and  assign  -45  as  the  weight  of  sel(f),  then  the  Smodels 
output  is  indeed  to  select  b,  d,  e,  and  f. 
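Both outcomes described above can be reproduced by enumeration. The Python sketch below treats the seller f as a bid with negative quantities and a negative price, and searches all subsets for the largest surplus; it is a brute-force check for this instance only, and the function name `best_exchange` is an assumption made for this illustration.

```python
from itertools import combinations

# Units per bid; the seller f offers units, written as negative quantities.
UNITS = {
    "a": {"i": 6, "j": 4, "k": 4},
    "b": {"j": 6, "k": 4},
    "c": {"k": 2, "l": 10},
    "d": {"j": 4, "k": 2, "l": 4},
    "e": {"i": 6, "l": 6},
    "f": {"i": -8, "j": -10, "k": -6, "l": -12},
}
ITEMS = ("i", "j", "k", "l")


def best_exchange(seller_price):
    """Selection with maximal surplus; `seller_price` is the (negative) weight of sel(f)."""
    price = {"a": 23, "b": 9, "c": 8, "d": 25, "e": 15, "f": seller_price}

    def balanced(selection):
        # For every item, units bought must not exceed units offered by sellers.
        return all(sum(UNITS[b].get(item, 0) for b in selection) <= 0
                   for item in ITEMS)

    def surplus(selection):
        return sum(price[b] for b in selection)

    names = sorted(price)
    candidates = [s for k in range(len(names) + 1)
                  for s in combinations(names, k) if balanced(s)]
    best = max(candidates, key=surplus)
    return best, surplus(best)
```

With a seller price of -50 the best the exchange can do is accept nothing (surplus 0), because the best buyer combination brings only 49; with -45 it accepts b, d, e and f for a surplus of 4.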

6.1  Formal  Characterization 

In the case of a combinatorial exchange, instead of a single auctioneer, we have many sellers, who also present bids, but in their bids the $\lambda_i^j$'s and $p_i$'s are negative numbers denoting the fact that they want to sell (instead of buy) those items and they want to be paid (rather than being willing to pay). Here the winner determination problem [13] is an assignment of bids as accepted ($x_i = 1$) or not ($x_i = 0$), for $1 \le i \le n$, that satisfies the constraint

$\sum_{i=1}^{n} \lambda_i^j \times x_i \le 0$ for $1 \le j \le m$; and maximizes $\sum_{i=1}^{n} p_i \times x_i$.

Theorem 3. For a multi-unit combinatorial exchange problem with integer prices, each solution to the winner determination problem corresponds to an optimal answer set of the above encoding and vice-versa.

7  Expressing  Additional  Constraints 

In  this  section  we  show  how  further  generalizations  and  additional  constraints 
can  be  easily  expressed  in  Smodels. 

⁵ But the Smodels requirement of having a domain variable for Y (not shown in the above rule) makes it an inefficient way to compute aggregation. Having an efficient computation of aggregates together with the answer set semantics remains a challenge.

1. Suppose we would like to express the constraint that item 1 must be sold. We can achieve this by adding the following rules:

sold(X) ← item(X), bid(Y), sel(Y), in(X, Y).

← not sold(1).

2. Suppose we would like to have reserve prices⁶ in the single unit combinatorial auction. This can be encoded by the following modification of the program in Section 3. The main change is that we replace bid(X) by bid(X, Y) where Y was originally the weight of bid(X). This change allows us to compare the sum of the reserve prices of the items in a bid with the bid price, which now is the parameter Y instead of the weight of bid(X).

As regards the specification of the domain, the bids are specified as follows:
bid(a,24). bid(b,9). bid(c,8). bid(d,25). bid(e,15).

The composition of items and bids are as in part 2 and 3 of 3.1. The general rules, as described below, are different from the ones in 3.2.

(a) The following two rules label each bid as either selected or not selected. The third rule assigns a weight to sel(X,Y).

sel(X, Y) ← bid(X, Y), not not_sel(X, Y).
not_sel(X, Y) ← bid(X, Y), not sel(X, Y).
weight sel(X, Y) = Y.

(b) The following enforces the constraint that two different bids sharing an item cannot both be selected.

← sel(X, N), sel(Y, N'), X ≠ Y, in(I, X), in(I, Y).

(c) We have the following optimization statement.
maximize [sel(X,Y) : bid(X,Y)].

(d) We express the reserve price of each item by the following:
rp(1,2). rp(2,8). rp(3,8). rp(4,12).

(e) The following rules compute the sum of the reserve prices of the items in each bid, compare it with the bid price, and eliminate possible answer sets where the bid price of a selected bid is less than the sum of the reserve prices of the items in that bid.

in_rp(Item, Bid, Res_pr) ← in(Item, Bid), rp(Item, Res_pr).
weight in_rp(Item, Bid, Res_pr) = Res_pr.
item_num(X, Y) ← item(X), num(Y).

← C [in_rp(Item, Bid, Res_pr) : item_num(Item, Res_pr)] C,
bid(Bid, Bid_pr), sel(Bid, Bid_pr), Bid_pr < C.

3.  Suppose  we  would  like  to  have  a  constraint  that  item  1  and  3  must  not  go 
to  the  same  bidder.  In  the  simple  case  if  we  assume  that  each  bid  is  by  a 
different  bidder  we  can  encode  this  by  the  following  rule. 

4-  bid{X,  Y),sel{X,  Y),in{l,X),  m(3,  X). 

®  In  simple  auctions  reserve  price  of  an  item  is  the  minimum  price  a  seller  would  accept 
for  that  item.  Its  extension  [13]  to  combinatorial  auctions  will  become  clear  below. 
4. In the more general case where each bid has an associated bidder, we first
need to express this association as follows:

bidder(a, john). bidder(b, mary). bidder(c, john).
bidder(d, mary). bidder(e, peter).

Next  we  need  the  following  rules: 

goes_to(Item, Bidder) ← in(Item, X), bidder(X, Bidder), sel(X, Y).

← goes_to(1, B), goes_to(3, B).

Similarly, if we want to specify that items 1 and 3 must go to the same
bidder, then the last rule can be replaced by the following rules.

← goes_to(1, B), not goes_to(3, B).
← goes_to(3, B), not goes_to(1, B).

5. Suppose we would like to represent the constraint that every bidder must
return happy, i.e., at least one of her bids must be satisfied. This can be
expressed by the following:

happy(Bidder) ← bidder(X, Bidder), bid(X, Y), sel(X, Y).

← bidder(Bid, Bidder), not happy(Bidder).

6. Suppose the seller wants to deal only with wholesalers, i.e., it wants to
have the constraint that it only selects bids of a bidder if the total money
to be obtained from that bidder is at least $100. This can be achieved by
adding the following rules.

sel(Bid, Value, Bidder) ← bid(Bid, Value), sel(Bid, Value),
                          bidder(Bid, Bidder).
weight sel(Bid, Value, Bidder) = Value.
total(Bidder, C) ← C [sel(Bid, Value, Bidder) : bid(Bid, Value)] C.

← total(Bidder, C), C < 100.

7. Suppose the seller wants to avoid bid ‘a’ as it came late, unless it includes
an item that is not included in any other bid. This can be expressed by the
following rules.

ow_covered(Bid, Item) ← in(Item, Bid'), Bid ≠ Bid'.
not_ow_covered(Bid) ← in(Item, Bid), not ow_covered(Bid, Item).

← sel(a, Value), not not_ow_covered(a).

8. To keep inventory costs in check, the seller may require that no more than
5 unsold items be left after the selection. This can be expressed by the
following rules.

sold(I) ← item(I), bid(X, Y), sel(X, Y), in(I, X).
unsold(I) ← item(I), not sold(I).

← C {unsold(I) : item(I)} C, C > 5.

9. To contain shipping and handling costs, the seller may require that bids
be accepted such that at least 5 items go to each bidder. This can be
expressed by the following rules.

count(Bidder, C) ← C {goes_to(Item, Bidder) : item(Item)} C,
                   bidder(B, Bidder).

← bidder(B, Bidder), count(Bidder, C), C < 5.

10. If item ‘a’ is a family treasure, the seller may require that it can only
be sold to bidder john or mary, his relatives. This can be expressed by the
following rule.

← goes_to(a, X), X ≠ john, X ≠ mary.
The above shows how additional constraints and generalizations can be easily
expressed as new Smodels rules; often we do not have to change the original
program, but just have to add new rules.
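
To make the effect of these constraints concrete, the following Python sketch enumerates bid selections by brute force. It is not Smodels and not the encoding above: it is an illustrative explicit search. The reserve prices and the bid-to-bidder association follow the facts in the text, whereas the item sets and prices of bids a to e are hypothetical values chosen for the example.

```python
from itertools import combinations

# Reserve prices as in the rp/2 facts above.
RESERVE = {1: 2, 2: 8, 3: 8, 4: 12}

# bid name -> (items, price, bidder). The bidders follow the bidder/2 facts;
# the item sets and prices are hypothetical, invented for this sketch.
BIDS = {
    "a": ({1, 2}, 12, "john"),
    "b": ({3}, 9, "mary"),
    "c": ({3, 4}, 18, "john"),   # below its reserve sum 8 + 12 = 20
    "d": ({2, 4}, 21, "mary"),
    "e": ({1}, 3, "peter"),
}

def goes_to(selection, bids):
    """Map each sold item to the bidder who receives it."""
    return {item: bids[b][2] for b in selection for item in bids[b][0]}

def feasible(selection, bids, reserve):
    """Selected bids must be item-disjoint and meet their reserve sums."""
    seen = set()
    for b in selection:
        items, price, _ = bids[b]
        if items & seen or price < sum(reserve[i] for i in items):
            return False
        seen |= items
    return True

def best_selection(bids, reserve, extra_constraints=()):
    """Return (selected bids, revenue) with maximal revenue, or None."""
    best = None
    names = sorted(bids)
    for k in range(len(names) + 1):
        for selection in combinations(names, k):
            if not feasible(selection, bids, reserve):
                continue
            if not all(c(selection, bids) for c in extra_constraints):
                continue
            total = sum(bids[b][1] for b in selection)
            if best is None or total > best[1]:
                best = (frozenset(selection), total)
    return best

def every_bidder_happy(selection, bids):
    """Item 5: every bidder has at least one selected bid."""
    winners = {bids[b][2] for b in selection}
    return all(bidder in winners for _, _, bidder in bids.values())

def items_1_and_3_separate(selection, bids):
    """Item 4: items 1 and 3 must not go to the same bidder."""
    owner = goes_to(selection, bids)
    return not (1 in owner and 3 in owner and owner[1] == owner[3])
```

As in the logic programs, each additional requirement is a separate predicate that is added to the list of constraints without touching the core search.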


8  Conclusion 

In this paper we showed how the combinatorial auction problem and its
generalizations can be expressed and solved using the declarative knowledge
representation language of Smodels. We argued that the declarativeness of
Smodels allows us to easily make generalizations and add constraints. Although
our focus was more on knowledge representation, we ran some experiments on
synthetic examples following the approach of [6,12]. In the case of single-unit
bids, our results were comparable to those reported in [12]. In the case of
multi-unit bids with synthetic data drawn from a decay distribution [6], we
obtained reasonable timings for bundle sizes up to 1500, with 150 items. Our
timings were worse than those of [6], though. We did not compare with the
timings in [4,13], as the first describes an incomplete algorithm and the
second does not report timings. We hope the programs in this paper will serve
as a benchmark and a challenge to researchers in logic programming, declarative
problem solving and knowledge representation in terms of producing faster
implementations of Smodels.
Abstract. In this paper bounded model checking of asynchronous concurrent
systems is introduced as a promising application area for answer set
programming. As the model of asynchronous systems a generalization of
communicating automata, 1-safe Petri nets, is used. It is shown how a 1-safe
Petri net and a requirement on the behavior of the net can be translated into
a logic program such that the bounded model checking problem for the net can
be solved by computing stable models of the corresponding program. The use of
the stable model semantics leads to compact encodings of bounded reachability
and deadlock detection tasks as well as the more general problem of bounded
model checking of linear temporal logic. Some experimental results on solving
deadlock detection problems using the translation and the Smodels system are
presented.


1  Introduction 

In this paper we put forward symbolic model checking [2,3] as a promising
application area for answer set programming systems. In particular, we
demonstrate how bounded model checking problems of asynchronous concurrent
systems can be reduced to computing stable models of logic programs.

Verification  of  asynchronous  systems  is  typically  done  by  enumerating  the 
set  of  reachable  states  of  the  system.  Tools  based  on  this  approach  (with  various 
enhancements)  include,  e.g.,  the  Spin  system  [12],  which  supports  extended  state 
machines  communicating  through  FIFO  queues,  and  the  PROD  tool  [17]  based 
on  Petri  nets.  The  main  problem  with  enumerative  model  checkers  is  the  amount 
of  memory  needed  to  store  the  set  of  reachable  states. 

Symbolic model checking is widely applied, especially in hardware
verification. The main analysis technique is based on (ordered) binary decision
diagrams (BDDs). In many cases the set of reachable states can be represented
very compactly using a BDD encoding. Although the approach has been successful,
there are difficulties in applying BDD-based techniques, in particular in areas
outside hardware verification. The key problem is that some Boolean functions
do not have a compact representation as BDDs, and the size of the BDD
representation of a Boolean function is very sensitive to the variable ordering
used. Bounded model checking [1] has been proposed as a technique for
overcoming the space problem by replacing BDDs with satisfiability (SAT)
checking techniques, because typical SAT checkers use only a polynomial amount
of memory. The idea is roughly the following. Given a sequential digital
circuit, a (temporal) property to be verified, and a bound n, the behavior of
the circuit is unfolded up to n steps as a Boolean formula S, and the negation
of the property to be verified is represented as a Boolean formula R. The
translation to Boolean formulae is done so that $S \wedge R$ is satisfiable iff
the system has a behavior of length at most n violating the property. Hence,
bounded model checking directly provides interesting and practically relevant
benchmarks for any answer set programming system capable of handling
propositional satisfiability problems.

[...] Model Checking” [11] presented at the AAAI Spring 2001 Symposium on Answer Set [...]

Until now bounded model checking has been applied to synchronous hardware
verification, and little attention has been given to knowledge representation
issues such as developing concise and efficient logical representations of
system behavior. In this work we study the knowledge representation problem and
employ ideas used in reducing planning to stable model computation [15]. The
aim is to develop techniques such that the behavior of an asynchronous
concurrent system can be encoded compactly and the inherent concurrency in the
system can be exploited in model checking the system. To illustrate the
approach we use a simple basic Petri net model of asynchronous systems, 1-safe
Place/Transition nets, which is an interesting generalization of communicating
automata [5].

The structure of the rest of the paper is the following. In the next section
we introduce Petri nets and the bounded model checking problem. Then we
develop a compact encoding of bounded model checking as the problem of finding
stable models of logic programs. We first show how to treat reachability
properties such as deadlocks and then demonstrate how to extend the approach to
cope with properties expressed in linear temporal logic (LTL). We discuss
initial experimental results and end with some concluding remarks.


2  Petri  Nets  and  Bounded  Model  Checking 

We will now introduce P/T-nets. They are one of the simplest forms of Petri
nets. We will use as a running example the P/T-net presented in Fig. 1.

A triple $(P, T, F)$ is a net if $P \cap T = \emptyset$ and
$F \subseteq (P \times T) \cup (T \times P)$. The elements of P are called
places, and the elements of T transitions. Places and transitions are also
called nodes. The places are represented in graphical notation by circles,
transitions by squares, and the flow relation F with arcs. We identify F with
its characteristic function on the set $(P \times T) \cup (T \times P)$. The
preset of a node x, denoted by ${}^{\bullet}x$, is the set
$\{y \in P \cup T \mid F(y, x) = 1\}$. In our running example, e.g.,
${}^{\bullet}t2 = \{p1, p2\}$. The postset of a node x, denoted by
$x^{\bullet}$, is the set $\{y \in P \cup T \mid F(x, y) = 1\}$. Again in our
running example $p2^{\bullet} = \{t2, t3, t5\}$.
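
The definitions of preset and postset can be sketched in a few lines of Python. The arc set below is an assumption: Fig. 1 is not reproduced in the text, so the arcs are inferred from the rules of Fig. 2 and from the deadlock rules given for the running example.

```python
# Flow relation F of the running example as a set of arcs (x, y).
# Assumption: inferred from the rules in Fig. 2, not read off Fig. 1.
F = {
    ("p3", "t1"), ("t1", "p1"),
    ("p1", "t2"), ("p2", "t2"), ("t2", "p3"), ("t2", "p4"),
    ("p2", "t3"), ("t3", "p4"),
    ("p4", "t4"), ("t4", "p2"),
    ("p2", "t5"), ("t5", "p5"),
}

def preset(x, flow=F):
    """Return the preset of node x: all y with an arc (y, x)."""
    return {y for (y, z) in flow if z == x}

def postset(x, flow=F):
    """Return the postset of node x: all y with an arc (x, y)."""
    return {y for (z, y) in flow if z == x}
```

With this arc set, `preset("t2")` and `postset("p2")` reproduce the two sets quoted in the paragraph above.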
Fig. 1. Running Example


A marking of a net $(P, T, F)$ is a mapping $P \to \mathbb{N}$. A marking M is
identified with the multiset which contains $M(p)$ copies of p for every
$p \in P$. A 4-tuple $\Sigma = (P, T, F, M_0)$ is a net system (also called a
P/T-net) if $(P, T, F)$ is a net and $M_0$ is a marking of $(P, T, F)$. A
marking is graphically denoted by a distribution of tokens on the places of the
net. In our running example in Fig. 1 the net has the initial marking
$M_0 = \{p1, p2\}$.

A marking M enables a transition $t \in T$ if
$\forall p \in P : F(p, t) \leq M(p)$. If t is enabled, it can occur leading to
a new marking (denoted $M \xrightarrow{t} M'$), where $M'$ is defined by
$\forall p \in P : M'(p) = M(p) - F(p, t) + F(t, p)$. In the running example t2
is enabled in the initial marking $M_0$, and thus $M_0 \xrightarrow{t2} M'$,
where $M' = \{p3, p4\}$.
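
The enabling condition and the firing rule translate directly into code. The sketch below represents markings as multisets with `collections.Counter`; the `PRE` and `POST` tables are an assumed encoding of the running example (inferred from Fig. 2), with every arc weight equal to 1.

```python
from collections import Counter

# Assumed presets and postsets of the running example (all arc weights 1).
PRE = {"t1": ["p3"], "t2": ["p1", "p2"], "t3": ["p2"], "t4": ["p4"], "t5": ["p2"]}
POST = {"t1": ["p1"], "t2": ["p3", "p4"], "t3": ["p4"], "t4": ["p2"], "t5": ["p5"]}

def enabled(marking, t):
    """t is enabled if every input place p holds at least F(p, t) tokens."""
    need = Counter(PRE[t])
    return all(marking[p] >= k for p, k in need.items())

def fire(marking, t):
    """Return M' with M'(p) = M(p) - F(p, t) + F(t, p)."""
    if not enabled(marking, t):
        raise ValueError(f"{t} is not enabled")
    new = Counter(marking)
    new.subtract(PRE[t])   # remove the consumed tokens
    new.update(POST[t])    # add the produced tokens
    return +new            # unary plus drops places with zero tokens

M0 = Counter({"p1": 1, "p2": 1})
```

Firing t2 in `M0` yields the marking with one token each on p3 and p4, as stated in the text.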

A marking $M_n$ is reachable in $\Sigma$ if there is an execution, i.e., a
(possibly empty) sequence of transitions $t_1, t_2, \ldots, t_n$ and markings
$M_1, M_2, \ldots, M_{n-1}$ such that:
$M_0 \xrightarrow{t_1} M_1 \xrightarrow{t_2} \cdots M_{n-1} \xrightarrow{t_n} M_n$.
A marking M is reachable within a bound n, if there is an execution with at
most n transitions, with which M is reachable.

A marking M is 1-safe if $\forall p \in P : M(p) \leq 1$. A P/T-net is 1-safe
if all its reachable markings are 1-safe. We will restrict ourselves to finite
P/T-nets which are 1-safe, and in which each transition has both nonempty pre-
and postsets.

Given a 1-safe P/T-net $\Sigma$, we say that a set of transitions
$S \subseteq T$ is concurrently enabled in the marking M, if (i) all
transitions $t \in S$ are enabled in M, and (ii) for all pairs of transitions
$t, t' \in S$, such that $t \neq t'$, it holds that
${}^{\bullet}t \cap {}^{\bullet}t' = \emptyset$. If a set S is concurrently
enabled in the marking M, we can fire it in a step (denoted
$M \xrightarrow{S} M'$), where $M'$ is the marking reached after firing all of
the transitions in the step S in arbitrary order. It is easy to prove by using
the 1-safeness of the P/T-net $\Sigma$ that all possible interleavings of
transitions in a step S are enabled in M, and that they all lead to the same
final marking $M'$. In our running example in the marking $M' = \{p3, p4\}$ the
step $\{t1, t4\}$ is enabled, and will lead back to the initial marking $M_0$.
This is denoted by $M' \xrightarrow{\{t1, t4\}} M_0$.

Notice also that for any enabled transition, the singleton set containing only
that transition is always (trivially) a step.
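
A minimal sketch of step firing for 1-safe nets follows, with markings represented as sets of marked places. The `PRE` and `POST` tables are again an assumed encoding of the running example, inferred from Fig. 2.

```python
from itertools import combinations

# Assumed presets and postsets of the running example.
PRE = {"t1": {"p3"}, "t2": {"p1", "p2"}, "t3": {"p2"}, "t4": {"p4"}, "t5": {"p2"}}
POST = {"t1": {"p1"}, "t2": {"p3", "p4"}, "t3": {"p4"}, "t4": {"p2"}, "t5": {"p5"}}

def concurrently_enabled(marking, step):
    """Condition (i): all enabled; condition (ii): presets pairwise disjoint."""
    if not all(PRE[t] <= marking for t in step):
        return False
    return all(PRE[t].isdisjoint(PRE[u])
               for t, u in combinations(sorted(step), 2))

def fire_step(marking, step):
    """Fire all transitions of a concurrently enabled step at once."""
    if not concurrently_enabled(marking, step):
        raise ValueError("step is not concurrently enabled")
    consumed = set().union(*(PRE[t] for t in step))
    produced = set().union(*(POST[t] for t in step))
    return frozenset((marking - consumed) | produced)
```

For example, t2 and t3 are both enabled in the initial marking but share the input place p2, so they do not form a step, whereas {t1, t4} is a step in {p3, p4}.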
We say that a marking $M_n$ is reachable in step semantics in a 1-safe P/T-net
if there is a step execution, i.e., a (possibly empty) sequence
$S_1, S_2, \ldots, S_n$ of steps and $M_1, M_2, \ldots, M_{n-1}$ of markings
such that:
$M_0 \xrightarrow{S_1} M_1 \xrightarrow{S_2} \cdots M_{n-1} \xrightarrow{S_n} M_n$.
A marking M is reachable within a bound n in the step semantics, if there is a
step execution with at most n steps, with which M is reachable.

We  will  refer  to  the  “normal  semantics”  as  interleaving  semantics.  Note  that 
if  a  marking  is  reachable  in  n  transitions  in  the  interleaving  semantics,  it  is 
also  reachable  in  n  steps  in  the  step  semantics.  However,  the  converse  does  not 
necessarily  hold.  We  have,  however,  the  following  theorem. 

Theorem 1. For finite 1-safe P/T-nets the sets of reachable markings in the
interleaving and step semantics coincide.

Linear temporal logic (LTL). The linear temporal logic LTL is one of the most
widely used logics for specifying properties of reactive systems [3]. The basic idea
is  to  specify  properties  that  the  system  should  have  using  LTL.  A  model  checker 
is  then  used  to  check  whether  all  (infinite)  behaviors  of  the  system  are  models 
of  the  specification  formula.  If  not,  then  the  model  checker  outputs  a  behavior 
of  the  system  which  violates  the  given  specification. 

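Given a finite set AP of atomic propositions, the syntax of LTL is given by:

$\varphi ::= p \in AP \mid \neg\varphi_1 \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1\, U\, \varphi_2 \mid \varphi_1\, R\, \varphi_2$.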

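An $\omega$-word over $2^{AP}$ is an infinite sequence $w = x_0 x_1 \ldots$
such that $x_i \in 2^{AP}$ for all $i \geq 0$. For an $\omega$-word w we define
$w(i) = x_i$, and denote by $w^{(i)}$ the suffix of w starting at $x_i$. We
define the relation $w \models \varphi$ inductively as follows:

- $w \models p$ iff $p \in w(0)$ for $p \in AP$
- $w \models \neg\varphi_1$ iff not $w \models \varphi_1$
- $w \models \varphi_1 \vee \varphi_2$ iff $w \models \varphi_1$ or $w \models \varphi_2$
- $w \models \varphi_1 \wedge \varphi_2$ iff $w \models \varphi_1$ and $w \models \varphi_2$
- $w \models \varphi_1\, U\, \varphi_2$ iff there exists a $j \geq 0$ such that
  $w^{(j)} \models \varphi_2$ and for all $0 \leq i < j$, $w^{(i)} \models \varphi_1$
- $w \models \varphi_1\, R\, \varphi_2$ iff for all $j \geq 0$, if for every
  $i < j$ $w^{(i)} \not\models \varphi_1$ then $w^{(j)} \models \varphi_2$.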

We define some shorthand LTL formulas: $\top = p \vee \neg p$ for some
arbitrary fixed $p \in AP$, $\bot = \neg\top$,
$\Diamond\varphi = (\top\, U\, \varphi)$, $\Box\varphi = (\bot\, R\, \varphi)$,
and $\varphi_1 \Rightarrow \varphi_2 = \neg\varphi_1 \vee \varphi_2$. The
temporal operators are called: U for “until”, R for “release”, $\Diamond$ for
“eventually”, and $\Box$ for “globally”. Some examples of practical use of LTL
formulas in specification are: $\Box\neg(cs_1 \wedge cs_2)$ (it always holds
that two processes are not at the same time in a critical section),
$\Box(req \Rightarrow \Diamond ack)$ (it is always the case that a request is
eventually followed by an acknowledgement), and
$((\Box\Diamond sch_1) \wedge (\Box\Diamond sch_2)) \Rightarrow (\Box(tr_1 \Rightarrow \Diamond cs_1))$
(if both process 1 and 2 are scheduled infinitely often, then always the
entering of process 1 in the trying section is followed by the process 1
eventually entering the critical section).
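
The inductive semantics can be evaluated mechanically on ultimately periodic words, i.e., a finite list of states whose last state loops back to an earlier position. The Python sketch below is an illustration under that assumption and is not part of the paper's translation: formulas are nested tuples, U is computed as a least fixed point and R as a greatest fixed point over the finitely many positions, and all function names are hypothetical.

```python
def ap(p): return ("ap", p)
def neg(f): return ("not", f)
def lor(f, g): return ("or", f, g)
def land(f, g): return ("and", f, g)
def until(f, g): return ("U", f, g)
def release(f, g): return ("R", f, g)

TOP = lor(ap("p"), neg(ap("p")))   # T = p v -p for a fixed p
BOTTOM = neg(TOP)
def eventually(f): return until(TOP, f)
def globally(f): return release(BOTTOM, f)
def implies(f, g): return lor(neg(f), g)

def holds(formula, states, loop_start):
    """Decide w |= formula for the word states[0], ..., states[n-1]
    continued forever by jumping from position n-1 back to loop_start."""
    n = len(states)

    def succ(i):
        return i + 1 if i + 1 < n else loop_start

    def ev(f):
        op = f[0]
        if op == "ap":
            return [f[1] in s for s in states]
        if op == "not":
            return [not v for v in ev(f[1])]
        a, b = ev(f[1]), ev(f[2])
        if op == "or":
            return [x or y for x, y in zip(a, b)]
        if op == "and":
            return [x and y for x, y in zip(a, b)]
        # U: least fixed point (start from False everywhere).
        # R: greatest fixed point (start from True everywhere).
        val = [op == "R"] * n
        changed = True
        while changed:
            changed = False
            for i in range(n):
                nxt = val[succ(i)]
                if op == "U":
                    new = b[i] or (a[i] and nxt)
                else:
                    new = b[i] and (a[i] or nxt)
                if new != val[i]:
                    val[i], changed = new, True
        return val

    return ev(formula)[0]
```

For instance, the word that starts with a request and then alternates between an empty state and an acknowledgement satisfies the request/acknowledgement formula above, whereas a request followed by empty states forever does not.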


Note that we do not define the often used next-time operator $X\varphi$. This
is a tradeoff which allows the use of step semantics.
Given a 1-safe P/T-net $\Sigma$, we use a chosen subset of the places as the
atomic propositions AP. An infinite (interleaving) execution
$M_0 \xrightarrow{t_1} M_1 \xrightarrow{t_2} \cdots$ satisfies $\varphi$ iff
the corresponding $\omega$-word $w = (M_0 \cap AP), (M_1 \cap AP), \ldots$
satisfies $\varphi$. We say that $\Sigma$ satisfies $\varphi$ iff every
infinite execution starting from the initial marking $M_0$ satisfies
$\varphi$. Alternatively, $\Sigma$ does not satisfy $\varphi$ if there exists
an infinite execution starting from $M_0$ which satisfies $\neg\varphi$. We
call such an execution a counterexample.

The temporal logic LTL specifies properties of infinite executions. In many
cases it suffices to reason about simple temporal properties. A typical example
is the reachability of a marking satisfying some condition C, which roughly
corresponds to finding a counterexample for a formula $\Box\neg C$. An
important reachability based property is deadlock detection.

Definition 1. (Deadlock) Given a 1-safe P/T-net $\Sigma$, is there a reachable
marking M which does not enable any transition of $\Sigma$?

Most analysis questions including deadlock detection and LTL model checking
are PSPACE-complete in the size of a 1-safe Petri net, see e.g., [6]. In
bounded model checking we fix a bound n and look for counterexamples which are
shorter than the given bound n. For example, in the case of bounded deadlock
detection in step semantics we look for step executions reaching a deadlock in
n steps. It is easy to show that, e.g., the bounded deadlock detection problem
in step semantics is NP-complete (when the bound n is given in unary coding).
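
As a point of comparison for the translation developed later, bounded deadlock detection can also be done by explicit enumeration. The sketch below is a plain breadth-first search, not the answer set programming encoding; for simplicity it uses the interleaving semantics, so the bound counts single transitions rather than steps. The net tables are an assumed encoding of the running example, inferred from Fig. 2.

```python
# Assumed presets and postsets of the running example.
PRE = {"t1": {"p3"}, "t2": {"p1", "p2"}, "t3": {"p2"}, "t4": {"p4"}, "t5": {"p2"}}
POST = {"t1": {"p1"}, "t2": {"p3", "p4"}, "t3": {"p4"}, "t4": {"p2"}, "t5": {"p5"}}

def successors(marking):
    """Yield (transition, next marking) for every enabled transition."""
    for t in sorted(PRE):
        if PRE[t] <= marking:
            yield t, frozenset((marking - PRE[t]) | POST[t])

def find_deadlock(initial, bound):
    """Search for a deadlock reachable with at most `bound` transitions.

    Returns (transition sequence, deadlocked marking), or None if no
    deadlock is found within the bound."""
    start = frozenset(initial)
    frontier = [((), start)]
    seen = {start}
    for depth in range(bound + 1):
        next_frontier = []
        for path, marking in frontier:
            succ = list(successors(marking))
            if not succ:
                return list(path), marking
            if depth < bound:
                for t, nxt in succ:
                    if nxt not in seen:
                        seen.add(nxt)
                        next_frontier.append((path + (t,), nxt))
        frontier = next_frontier
    return None
```

On the running example this finds the deadlock {p1, p5} after the single transition t5. A `None` result only means that no deadlock exists within the bound, which matches the inconclusiveness of bounded model checking discussed in the next paragraph.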

This idea can also be applied to LTL model checking. Biere et al. [1] introduce
bounded LTL model checking. They also discuss how to ensure that a given
bound n is sufficient to guarantee completeness. Unfortunately, getting an exact
bound is often computationally infeasible, and easily obtainable upper bounds
are too large. In the case of 1-safe P/T-nets they are exponential in the number
of places in the net. Therefore the bounded model checking results are usually
not conclusive if a counterexample is not found. Thus bounded model checking
is at its best in “bug hunting”, and not as easily applicable in verifying
systems to be correct.

3  From  Bounded  Model  Checking  to  Answer  Set
Programming

In  this  section  we  show  how  to  solve  bounded  LTL  model  checking  problems  using 
answer  set  programming.  We  start  with  the  simpler  reachability  properties  and 
then  extend  the  approach  to  handle  full  LTL  model  checking. 

For encoding bounded model checking problems we use normal logic programs with
the stable model semantics [8]. A normal rule is of the form

$a \leftarrow b_1, \ldots, b_m, \mathrm{not}\ c_1, \ldots, \mathrm{not}\ c_n$   (1)

where each $a$, $b_i$, $c_j$ is a ground atom. We employ three extensions which
can be seen as compact shorthands for normal rules. We use integrity
constraints, i.e., rules with empty head. A constraint like the one on the left
can be taken as a shorthand for the rule given on the right

$\leftarrow b, \mathrm{not}\ c$        $f \leftarrow \mathrm{not}\ f, b, \mathrm{not}\ c$

where f is a new atom. For expressing the choice whether to include an atom in
a stable model we use choice rules. They are normal rules where the head is in
brackets, with the idea that the head can be included in a stable model only if
the body holds, but it can be left out, too. Such a construct can be
represented using normal rules by introducing a new atom. For example, the
choice rule on the left corresponds to the two normal rules on the right, where
$a'$ is a new atom.

$\{a\} \leftarrow b, \mathrm{not}\ c$        $a \leftarrow \mathrm{not}\ a', b, \mathrm{not}\ c$
                                             $a' \leftarrow \mathrm{not}\ a$

Finally, a compact encoding of conflicts is needed, i.e., rules of the form

$\leftarrow 2\{a_1, \ldots, a_n\}$   (2)

saying that a stable model cannot contain any two atoms out of a set of atoms
$\{a_1, \ldots, a_n\}$. Such a rule can be expressed, e.g., by adding a rule
$f \leftarrow \mathrm{not}\ f, a_i, a_j$ for each pair $a_i, a_j$ from
$\{a_1, \ldots, a_n\}$, i.e., using $O(n^2)$ rules. Choice and conflict rules
are simple cases of cardinality constraint rules [16]. The Smodels system
(http://www.tcs.hut.fi/Software/smodels/) provides an implementation for
cardinality constraint rules and includes primitives supporting such
constraints directly, without translating them first to corresponding normal
rules.
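
The effect of these shorthands can be checked on small ground programs with a brute-force stable model enumerator. The sketch below is an illustration only and bears no relation to how Smodels computes models: it tries every candidate set of atoms, builds the reduct, and compares the least model of the reduct with the candidate. The atom name `a_aux` stands for the new atom $a'$.

```python
from itertools import chain, combinations

def least_model(definite_rules):
    """Least model of a negation-free program given as (head, body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def stable_models(rules):
    """rules: list of (head, positive_body, negative_body) over ground atoms."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, p, n in rules for a in chain(p, n)})
    result = []
    for k in range(len(atoms) + 1):
        for cand in combinations(atoms, k):
            m = set(cand)
            # Reduct: drop rules blocked by m, then drop negative bodies.
            reduct = [(h, p) for h, p, n in rules if not (set(n) & m)]
            if least_model(reduct) == m:
                result.append(m)
    return result

# Fact b, plus the choice rule {a} <- b, not c in its two-rule form.
choice = [("b", [], []),
          ("a", ["b"], ["a_aux", "c"]),
          ("a_aux", [], ["a"])]

# The integrity constraint <- a written as f <- not f, a.
constraint = [("f", ["a"], ["f"])]
```

The choice program has two stable models, one with `a` and one with the auxiliary atom instead; adding the constraint removes the model containing `a`.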

3.1  Reachability  Checking 

Now we devise a method for translating bounded reachability problems of 1-safe
P/T-nets to tasks of finding stable models. Consider a net $N = (P, T, F)$ and
a step bound $n \geq 1$. We construct a logic program $\Pi_A(N, n)$, which
captures the possible executions of N up to n steps, as follows.

- For each place $p \in P$, include a choice rule $\{p(0)\} \leftarrow$.

- For each transition $t \in T$, and for all $i = 0, 1, \ldots, n-1$, include a
  rule

  $\{t(i)\} \leftarrow p_1(i), \ldots, p_l(i)$   (3)

  where $\{p_1, \ldots, p_l\}$ is the preset of t. Hence, a stable model can
  contain a transition instance in step i only if its preset holds at step i.

- For each place $p \in P$, for each transition $t_k$ in the preset of p, and
  for all $i = 0, 1, \ldots, n-1$, include a rule

  $p(i+1) \leftarrow t_k(i)$.   (4)

  These say that p holds in the next step if at least one of its preset
  transitions is in the current step.
{p1(0)} ←    {t1(i)} ← p3(i)          p1(i+1) ← t1(i)    p1(i+1) ← p1(i), not t2(i)
{p2(0)} ←    {t2(i)} ← p1(i), p2(i)   p2(i+1) ← t4(i)    p2(i+1) ← p2(i), not t2(i),
                                                                   not t3(i), not t5(i)
{p3(0)} ←    {t3(i)} ← p2(i)          p3(i+1) ← t2(i)    p3(i+1) ← p3(i), not t1(i)
{p4(0)} ←    {t4(i)} ← p4(i)          p4(i+1) ← t2(i)    p4(i+1) ← p4(i), not t4(i)
{p5(0)} ←    {t5(i)} ← p2(i)          p4(i+1) ← t3(i)    p5(i+1) ← p5(i)
                                      p5(i+1) ← t5(i)
← 2{t2(i), t3(i), t5(i)}

where i = 0, 1, ..., n − 1

Fig. 2. Program $\Pi_A(N, n)$


- For each place $p \in P$, and for all $i = 0, 1, \ldots, n-1$, include a rule

  $\leftarrow 2\{t_1(i), \ldots, t_l(i)\}$   (5)

  where $\{t_1, \ldots, t_l\}$ is the set of transitions having p in their
  preset and $l \geq 2$. This rule states that at most one of the transitions
  that are in conflict w.r.t. p can occur.

- For each place p, and for all $i = 0, 1, \ldots, n-1$, include a rule

  $p(i+1) \leftarrow p(i), \mathrm{not}\ t_1(i), \ldots, \mathrm{not}\ t_l(i)$   (6)

  where $\{t_1, \ldots, t_l\}$ is the set of transitions having p in their
  preset. This is the frame axiom for p stating that p holds if no transition
  using it occurs.
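
The construction above is mechanical, so it can be sketched as a small generator that emits one string per ground rule. The net tables are an assumed encoding of the running example (inferred from Fig. 2), and the textual rule syntax (`:-` for the arrow, `:- 2 {...}` for conflict rules) is an assumption modelled on common logic programming notation rather than a checked Smodels input format.

```python
# Assumed presets and postsets of the running example.
PRE = {"t1": ["p3"], "t2": ["p1", "p2"], "t3": ["p2"], "t4": ["p4"], "t5": ["p2"]}
POST = {"t1": ["p1"], "t2": ["p3", "p4"], "t3": ["p4"], "t4": ["p2"], "t5": ["p5"]}

def program_A(pre, post, n):
    """Return the ground rules of the execution program for n steps."""
    places = sorted({p for ps in list(pre.values()) + list(post.values())
                     for p in ps})
    # Unconstrained initial marking: one choice rule per place.
    rules = [f"{{{p}(0)}}." for p in places]
    for i in range(n):
        for t in sorted(pre):                       # rule (3)
            body = ", ".join(f"{p}({i})" for p in pre[t])
            rules.append(f"{{{t}({i})}} :- {body}.")
        for t in sorted(post):                      # rule (4)
            for p in post[t]:
                rules.append(f"{p}({i + 1}) :- {t}({i}).")
        for p in places:
            users = [t for t in sorted(pre) if p in pre[t]]
            if len(users) >= 2:                     # rule (5)
                atoms = ", ".join(f"{t}({i})" for t in users)
                rules.append(f":- 2 {{{atoms}}}.")
            frame = "".join(f", not {t}({i})" for t in users)
            rules.append(f"{p}({i + 1}) :- {p}({i}){frame}.")   # rule (6)
    return rules
```

For one step the generator produces 22 rules, the same number as listed for a single value of i in Fig. 2, and the output grows linearly with the bound n.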

Consider net N in Fig. 1, for which program $\Pi_A(N, n)$ is given in Fig. 2.
In $\Pi_A(N, n)$ the initial marking is not constrained, but any Boolean
combination C of marking conditions can be captured with a set of rules
$\Pi_M(C, i)$ [16]. For example, to eliminate stable models not satisfying a
condition C at step i saying that $M(p_1) = 1$ and ($M(p_2) = 0$ or
$M(p_3) = 1$), it is sufficient to use rules

$\leftarrow \mathrm{not}\ c(i)$                        $c_{\neg p_2 \vee p_3}(i) \leftarrow \mathrm{not}\ p_2(i)$
$c(i) \leftarrow p_1(i), c_{\neg p_2 \vee p_3}(i)$     $c_{\neg p_2 \vee p_3}(i) \leftarrow p_3(i)$

Our approach can solve a reachability problem for a set of initial markings
given by a condition $C_0$ where the markings to be reached are specified by
another condition C.

Theorem 2. Let $N = (P, T, F)$ be a 1-safe P/T-net for all initial markings
satisfying a condition $C_0$. Net N has an initial marking satisfying $C_0$
such that a marking satisfying a condition C is reachable in at most n steps
iff $\Pi_M(C_0, 0) \cup \Pi_A(N, n) \cup \Pi_M(C, n)$ has a stable model.

The deadlock detection problem is now just a special case of a reachability
property: it suffices to add rules $\Pi_M(C, n) = \Pi_D(N, n)$ eliminating
stable models where some transition is enabled. Program $\Pi_D(N, n)$ includes
for each transition $t \in T$ and its preset $\{p_1, \ldots, p_l\}$, a rule

$\leftarrow p_1(n), \ldots, p_l(n)$.   (7)

For our running example, the rules $\Pi_D(N, n)$ are

← p3(n)    ← p1(n), p2(n)    ← p2(n)    ← p4(n).
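
The deadlock rules can likewise be generated from the presets alone. In this sketch the presets are an assumed encoding of the running example, the rule syntax is an assumed plain-text rendering, and `is_deadlock` states directly what the constraints check on the final marking.

```python
# Assumed presets of the running example.
PRE = {"t1": ["p3"], "t2": ["p1", "p2"], "t3": ["p2"], "t4": ["p4"], "t5": ["p2"]}

def deadlock_rules(pre, n="n"):
    """One constraint per distinct preset: no preset may hold at step n."""
    bodies = {tuple(sorted(ps)) for ps in pre.values()}
    return [":- " + ", ".join(f"{p}({n})" for p in body) + "."
            for body in sorted(bodies)]

def is_deadlock(marking, pre):
    """A marking violates no constraint iff it contains no complete preset."""
    return not any(set(ps) <= set(marking) for ps in pre.values())
```

Transitions t3 and t5 share the preset {p2}, which is why five transitions give rise to only four distinct rules.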
3.2  Bounded  LTL  Model  Checking

Our strategy for finding counterexamples for an LTL formula $\varphi$ (i.e.,
executions satisfying $\neg\varphi$) is exactly the same as in [1]. There it is
shown to be an approximation of the unbounded version which becomes equivalent
to the unbounded case if the bound used is sufficiently increased. We (as they
do) require that all reachable states of the system have a successor (i.e.,
there are no deadlocks). In this case the reachability of a marking satisfying
a condition C is equivalent to finding a counterexample for an LTL formula of
the form $\Box\neg C$.

We look for two different kinds of counterexamples. On the left in Fig. 3 is
a loop counterexample, and on the right is a counterexample without loop. Loop
counterexamples specify an infinite execution themselves, while counterexamples
without a loop specify a prefix of an execution, which can always be extended
to an infinite execution (by the deadlock freeness assumption). The arcs of the
figure denote the “next state” of each state. Notice in the loop counterexample
that if $M_{i-1}$ is equivalent to the last state $M_n$, the state $M_i$ is the
“next state” of $M_n$. Our semantics is cautious in the case without loop, and
extending the execution into an infinite one in any way will yield a
counterexample.


[Figure not reproduced; recoverable labels: $= M_n$, il(i), il(i+1), il(n).]

Fig. 3. Two counterexample possibilities


An LTL formula is said to be in positive normal form when all negations
in the formula appear directly before an atomic proposition. A formula can be
put into positive normal form with the following equivalences (and their duals):
$\neg\neg\varphi = \varphi$,
$\neg(\varphi_1 \vee \varphi_2) = \neg\varphi_1 \wedge \neg\varphi_2$, and
$\neg(\varphi_1\, U\, \varphi_2) = \neg\varphi_1\, R\, \neg\varphi_2$.
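
The rewriting into positive normal form is a short recursive function. The sketch below uses a hypothetical tuple representation of formulas, `("ap", p)`, `("not", f)`, and `(op, f, g)` with `op` one of `"or"`, `"and"`, `"U"`, `"R"`, and applies the three equivalences together with their duals.

```python
# Dual operator used when a negation is pushed through a binary operator.
DUAL = {"or": "and", "and": "or", "U": "R", "R": "U"}

def pnf(f):
    """Push negations inward until they only occur in front of atoms."""
    op = f[0]
    if op == "ap":
        return f
    if op in DUAL:
        return (op, pnf(f[1]), pnf(f[2]))
    # op == "not": look at the negated subformula g.
    g = f[1]
    if g[0] == "ap":
        return f                      # literal, already in normal form
    if g[0] == "not":
        return pnf(g[1])              # double negation
    return (DUAL[g[0]], pnf(("not", g[1])), pnf(("not", g[2])))
```

Each negation is moved past at most one operator per recursive call, so the result has the same number of binary operators as the input.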

Given an LTL formula f in positive normal form (when the formula to be
model checked is $\varphi$, the formula f is equivalent to $\neg\varphi$ with
negations pushed in), and a bound $n \geq 1$ we construct a program
$\Pi_{LTL}(f, n)$ as follows.

- Guess which state is equivalent to the last. For all $0 \leq i \leq n-1$ add
  rule

  $\{el(i)\} \leftarrow$.   (8)

- Disallow guessing two or more. (Guessing none is allowed though.) Add rule

  $\leftarrow 2\{el(0), el(1), \ldots, el(n-1)\}$.   (9)


Actually the counterexamples without loop are exactly the informative safety
counterexamples of [13].
Formula type               Translation

$p$, for $p \in AP$        $f(i) \leftarrow p(i)$

$\neg p$, for $p \in AP$   $f(i) \leftarrow \mathrm{not}\ p(i)$

$f_1 \vee f_2$             $f(i) \leftarrow f_1(i)$
                           $f(i) \leftarrow f_2(i)$

$f_1 \wedge f_2$           $f(i) \leftarrow f_1(i), f_2(i)$

$f_1\, U\, f_2$            $f(i) \leftarrow f_2(i)$
                           $f(i) \leftarrow f_1(i), f(i+1)$
                           $f(n+1) \leftarrow nl(i), f(i)$

$f_1\, R\, f_2$            $f(i) \leftarrow f_2(i), f_1(i)$
                           $f(i) \leftarrow f_2(i), f(i+1)$
                           $f(n+1) \leftarrow nl(i), f(i)$
                           $f(n+1) \leftarrow l, \mathrm{not}\ cstate(f)$
                           $cstate(f) \leftarrow il(i), \mathrm{not}\ f_2(i)$

Fig. 4. Translation of an LTL formula f


- Check that the guess is correct. For all $0 \leq i \leq n-1$, $p \in P$
  include rules

  $\leftarrow el(i), p(i), \mathrm{not}\ p(n)$      $\leftarrow el(i), p(n), \mathrm{not}\ p(i)$.

- Specify auxiliary loop related atoms. For all $0 \leq i \leq n-1$, include
  rules

  $l \leftarrow el(i)$    $nl(i+1) \leftarrow el(i)$    $il(i+1) \leftarrow el(i)$    $il(i+1) \leftarrow il(i)$.

  See Fig. 3 for an example. The $nl(i)$ atom is in a model for the “next
  state” of the last state, while $il(i)$ is in the model for all states in the
  loop.

- Require that if a loop exists, the last step contains a transition to
  disallow looping by idling. Add the rule

  $\leftarrow l, \mathrm{not}\ t_1(n-1), \ldots, \mathrm{not}\ t_k(n-1)$   (10)

  where $\{t_1, \ldots, t_k\} = T$, i.e., the set of all transitions.

- Allow at most one visible transition in a step to eliminate steps which
  cannot be interleaved to yield a counterexample. For all
  $0 \leq i \leq n-1$, add rule

  $\leftarrow 2\{t_1(i), \ldots, t_k(i)\}$   (11)

  where $\{t_1, \ldots, t_k\}$ is the set of visible transitions, i.e., the
  transitions whose firing changes the marking of a place p appearing in the
  formula f.

We recursively translate the formula f by first translating its subformulae,
and then f as follows. For all $0 \leq i \leq n$, add the rules given by
Fig. 4. Finally we require that the top level formula f should hold in the
initial marking

$\leftarrow \mathrm{not}\ f(0)$.   (12)

With this program $\Pi_{LTL}(f, n)$ we get our main result.

Theorem 3. Let f be an LTL formula in positive normal form and
$N = (P, T, F)$ be a 1-safe and deadlock free P/T-net for all initial markings
satisfying a condition $C_0$. If
$\Pi_M(C_0, 0) \cup \Pi_A(N, n) \cup \Pi_{LTL}(f, n)$ has a stable model, then
there is an execution of N from an initial marking satisfying $C_0$ which
satisfies f.

An equivalence explaining the release translation:
$f_1\, R\, f_2 = (\Box f_2) \vee (f_2\, U\, (f_2 \wedge f_1))$.
The size of the program in Theorem 3 is linear in the size of the net and
formula, i.e., $O((|P| + |T| + |F| + |f|) \cdot n)$. The semantics of LTL is
defined over interleaving executions. A novelty of the translation is that it
allows concurrency between invisible transitions.

Forcing interleaving semantics. We can create the interleaving semantics
versions of bounded model checking problems by adding a set of rules
$\Pi_I(N, n)$. It includes for each time step $0 \leq i \leq n-1$ a rule

$\leftarrow 2\{t_1(i), \ldots, t_m(i)\}$   (13)

where $\{t_1, \ldots, t_m\}$ is the set of all transitions. These rules
eliminate all stable models having more than one transition firing in a step.

Corollary 1. Let $\Pi_S(N, n)$ be a program solving a bounded model checking
problem in the step semantics using any of the translations above. Then the
program $\Pi_S(N, n) \cup \Pi_I(N, n)$ solves the same problem in the
interleaving semantics.

3.3  Relation  to  Previous  Work 

In  previous  work  on  bounded  model  checking  little  attention  has  been  given 
to  the  knowledge  representation  problem  of  encoding  succinctly  the  unfolded 
behavior  and  the  temporal  property.  We  address  this  problem  and  develop  an 
encoding  of  the  behavior  of  an  asynchronous  system  which  is  linear  in  the  size  of 
the  system  description  (Petri  net)  and  in  the  number  of  steps.  Moreover,  it  allows 
the  exploitation  of  the  inherent  concurrency  of  the  system  in  model  checking. 

Our approach could be used as a basis for a similar treatment using
propositional logic and satisfiability (SAT) checkers. For simple temporal properties
such as reachability and deadlock this is fairly straightforward to develop
using the ideas of Clark’s completion and Fages’ theorem [7]. This is because our
encoding produces acyclic programs except for the choice rules, which need
special treatment. Achieving a compact SAT encoding is more challenging
because propositional logic lacks cardinality constraint rules (2). Their mapping
to propositional formulae can result in a quadratic blow-up, which is sometimes
significant as conflicts may involve even hundreds of transitions.

For general LTL model checking a succinct SAT encoding is challenging. The
compactness of our encoding is due to the fact that stable model semantics
supports the smallest fixed point evaluation of recursive rules, which is exploited in
translating the U and R operators. Because of these recursive rules a similarly
compact SAT encoding is not immediate. A SAT encoding is given in [1]; it
remains polynomial but is more complicated than our linear-size encoding.

4  Experiments 

We have implemented the deadlock detection and LTL model checking
translations presented in the previous section. The translation is given a fixed initial
marking $M_0$, which allows the following optimizations to be implemented:




Keijo  Heljanko  and  Ilkka  Niemela 


Problem     |P|   |T|    St. n   St. s   Int. n   Int. s   States
DARTES(1)   ?     ?      ?       ?       ?        ?        ?
DP(6)       36    24     ?       0.0     6        ?        728
DP(8)       48    32     ?       0.0     8        ?        6560
DP(10)      60    40     ?       0.0     10       ?        59048
DP(12)      72    48     ?       0.0     12       ?        531440
ELEV(1)     63    ?      4       0.0     ?        0.4      163
ELEV(2)     146   ?      6       0.5     12       3.9      1092
ELEV(3)     327   783    8       5.6     15       139.0    7276
ELEV(4)     736   1939   10      157.2   >13      1215.2   48217
HART(25)    127   77     1       0.0     >5       1.0      >1000000
HART(50)    252   152    1       0.0     >5       5.7      >1000000
HART(75)    377   227    1       0.0     >5       15.5     >1000000
HART(100)   502   302    1       0.0     >5       35.9     >1000000

(? marks an entry that is illegible in the source. The only legible value in the
DARTES(1) row is 0.5. The remaining rows have illegible labels; their surviving
values, in source order, are: 536, 172, 7, 11.1, 10, 87.2, 7702, 232, 8, 687.3, >11.)

Fig. 5. Experiments


– Place and transition atoms are added only from the time step at which they
can first appear. Only atoms for places $p(0)$ in the initial marking are created
for time $i = 0$. Then for each $0 \le i \le n-1$: (i) add transition atoms for
all transitions $t(i)$ such that all the place atoms in the preset of $t(i)$ exist;
(ii) add place atoms for all places $p(i+1)$ such that either the place atom
$p(i)$ exists or some transition atom in the preset of $p(i+1)$ exists.

– Duplicate rules are removed. Duplicates can appear in (5) and (7).
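
The first optimization amounts to a forward pass over the time steps. The following Python sketch shows one way to compute which atoms need to be created; the data layout (a dictionary from transition names to preset and postset) and the tiny three-place net are assumptions for illustration, not the authors' implementation.

```python
def reachable_atoms(places, transitions, initial_marking, n):
    """Return (place_atoms, transition_atoms) as sets of (name, time) pairs.

    transitions maps a transition name to a pair (preset, postset),
    each a set of place names.
    """
    place_atoms = {(p, 0) for p in initial_marking}
    trans_atoms = set()
    for i in range(n):
        # (i) t(i) is created only if every place atom in its preset exists.
        for t, (pre, _post) in transitions.items():
            if all((p, i) in place_atoms for p in pre):
                trans_atoms.add((t, i))
        # (ii) p(i+1) is created if p(i) exists or if some transition
        # that produces a token in p has an atom at time i.
        for p in places:
            produced = any(
                (t, i) in trans_atoms and p in post
                for t, (_pre, post) in transitions.items()
            )
            if (p, i) in place_atoms or produced:
                place_atoms.add((p, i + 1))
    return place_atoms, trans_atoms


# Hypothetical net: p1 --t1--> p2 --t2--> p3, with p1 marked initially.
places = ["p1", "p2", "p3"]
transitions = {"t1": ({"p1"}, {"p2"}), "t2": ({"p2"}, {"p3"})}
place_atoms, trans_atoms = reachable_atoms(places, transitions, {"p1"}, 2)
```

In this net no atom is created for t2 at time 0 or for p3 at time 1, because neither can occur that early. The pass over-approximates reachability, so it never drops an atom that a real execution needs.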

As benchmarks we use a set of deadlock detection benchmarks collected by
Corbett [4], converted to 1-safe P/T-nets by Melzer and Römer [14]. The models
were picked from those which have a deadlock. For each model and both
semantics we incremented the bound until a deadlock was found. We report the
time for Smodels to find the first stable model using this bound. In some cases
a model could not be found within a reasonable time, in which case we report
the time used to prove that there is no deadlock within the reported bound.
Unfortunately, we did not have a large collection of LTL model checking examples,
and benchmarking the LTL translation is left for further work. The experimental
results can be found in Fig. 5. The columns are:

– Problem: The problem name with the size of the instance in parentheses.

– $|P|$: Number of places in the original net.

– $|T|$: Number of transitions in the original net.

– St. n: The smallest integer n such that a deadlock could be found using the
step semantics or, in the case of > n, the largest integer n for which we could
prove that there is no deadlock within that bound using the step semantics.

– St. s: The time in seconds to find the first stable model or to prove that there
is no stable model. (See St. n above.)

– Int. n and Int. s: defined as St. n and St. s but for the interleaving semantics.

– States: Number of reachable states of the P/T-net (if known).

These differ from the ones reported in [11] where unfortunately there are some errors.

The times reported are the average of 5 runs of the time for smodels 2.26
as reported by the /usr/bin/time command on a 450 MHz Pentium III PC
running Linux. The tools, nets, and logic programs used are available from:
<http://www.tcs.hut.fi/~kepa/experiments/LPNMR2001/>.

In many of the experiments the step semantics version found a deadlock with
a smaller bound than the interleaving one. Also, when the bound needed to find
the deadlock was fairly small, the bounded model checker performed well.
In the examples ELEV(4), HART(x) and Q(1) we were able to find the
counterexample only when using step semantics. In the KEY(2) example we were
not able to find a counterexample with either semantics, even though the
problem is known to have only a small number of reachable states. In contrast, the
DARTES(1) problem has a large state space, and despite this a counterexample
of length 32 was obtained. Overall, the results are promising, in particular for
small bounds and the step semantics.


5  Conclusions 

We introduce bounded model checking of asynchronous concurrent systems
modeled by 1-safe P/T-nets as an interesting application area for answer set
programming. We present mappings from bounded reachability, deadlock detection and
LTL model checking problems of 1-safe P/T-nets to stable model computation.
Our approach is capable of doing model checking for a set of initial markings at
once. This is usually difficult to achieve in current enumerative model checkers
and often leads to state space explosion. We handle asynchronous systems using
a step semantics whereas previous work on bounded model checking only uses
the interleaving semantics [1]. Furthermore, our encoding is more compact than
the previous approach employing propositional satisfiability [1]. This is because
our rule-based approach allows executions of the system, e.g. frame axioms,
to be represented succinctly and directly supports the recursive fixed point
computation needed to evaluate LTL formulae.

The first experimental results indicate that stable model computation is quite
a competitive approach to searching for short executions of the system leading
to deadlock and worth further study. More experimental work and comparisons
are needed to determine the strength of the approach. In particular, for
comparing with SAT checking techniques, it would be interesting to develop a similar
treatment of asynchronous systems using a SAT encoding and compare it to the
logic program based approach.

Relating the net unfolding method (see [9,14] and further references there)
to bounded model checking would be interesting. There are also alternative
semantics to the two presented in this work [10]; applying them to bounded LTL
model checking is left for further work.

Abstract. In this paper we suggest an architecture for a software agent
which operates a physical device and is capable of making observations
and of testing and repairing the device components. We present novel
definitions of the notions of symptom, candidate diagnosis, and diagnosis
which are based on the theory of action language $\mathcal{AL}$. The new definitions
allow one to give a simple account of the agent’s behavior in which many
of the agent’s tasks are reduced to computing stable models of logic
programs.


1  Introduction 

In this paper we continue the investigation of applicability of A-Prolog (a loosely
defined collection of logic programming languages under the answer set
semantics [6]) to knowledge representation and reasoning. The focus is on the
development of an architecture for a software agent acting in a changing environment.
We assume that the agent and the environment (sometimes referred to as a
dynamic system) satisfy the following simplifying conditions.

1.  The  agent’s  environment  is  viewed  as  a  transition  diagram  whose  states  are 
sets  of  fluents  (relevant  properties  of  the  domain  whose  truth  values  may 
depend  on  time)  and  whose  arcs  are  labeled  by  actions. 

2.  The  agent  is  capable  of  making  correct  observations,  performing  actions,  and 
remembering  the  domain  history. 

These assumptions hold in many realistic domains and are suitable for a broad
class of applications. In many domains, however, the effects of actions and the
truth values of observations can only be known with a substantial degree of
uncertainty which cannot be ignored in the modeling process. It remains to be
seen if some of our methods can be made to work in such situations. The above
assumptions determine the structure of the agent’s knowledge base. It consists
of three parts. The first part, called an action (or system) description, specifies
the transition diagram representing possible trajectories of the system. It
contains descriptions of the domain’s actions and fluents, together with the definition
of possible successor states to which the system can move after an action $a$ is
executed in a state $\sigma$. The second part of the agent’s knowledge, called a history
description, contains observations made by the agent together with a record of
its own actions. It defines a collection of paths in the diagram which can be
interpreted as the system’s possible pasts. If the agent’s knowledge is complete (i.e., it
has complete information about the initial state and the occurrences of actions)
and the system’s actions are deterministic then there is only one such path. The
third part of the agent’s knowledge base contains a collection of the agent’s goals.
All this knowledge is used and updated by the agent, who repeatedly executes
the following steps:

1.  observe  the  world  and  interpret  the  observations; 

2.  select  a  goal; 

3.  plan; 

4.  execute  part  of  the  plan. 

In this paper we concentrate on agents operating physical devices and capable
of testing and repairing the device components. We are especially interested in
the first step of the loop, i.e. in the agent’s interpretations of discrepancies between
the agent’s predictions and the system’s actual behavior. The following example will
be used throughout the paper:

Example 1. Consider a system $S$ consisting of an analog circuit $\mathcal{AC}$ from figure
1. We assume that switches $s_1$ and $s_2$ are mechanical components which cannot
become damaged. Relay $r$ is a magnetic coil. If not damaged, it is activated
when $s_1$ is closed, causing $s_2$ to close. Undamaged bulb $b$ emits light if $s_2$ is
closed. For simplicity we consider an agent capable of performing only one
action, $close(s_1)$. The environment can be represented by two damaging exogenous
actions: $brk$, which causes $b$ to become faulty, and $srg$, which damages $r$ and
also $b$ assuming that $b$ is not protected. Suppose that the agent operating this
device is given a goal of lighting the bulb. He realizes that this can be achieved
by closing the first switch, performs the operation, and discovers that the bulb
is not lit. The goal of the paper is to specify the agent’s behavior after this
discovery.

We start by presenting our definitions of the notions of symptom, candidate
diagnosis, and diagnosis which are based on the theory of action language $\mathcal{AL}$ [1].
These definitions are used to give a simple account of the agent’s behavior in
which many of the agent’s tasks are reduced to computing stable models of logic
programs.


Background 

By a physical system $S$ we mean a triple $\langle C, F, A \rangle$ of finite sets. Elements of $C$ are
called components of $S$. Elements of $F$ are referred to as fluents. By fluent literals
we mean fluents and their negations (denoted by $\neg f$). The set $A$ of elementary
actions is partitioned into two disjoint sets, $A_s$ and $A_e$; $A_s$ consists of actions
performed by an agent and $A_e$ consists of exogenous actions whose occurrence
can cause system components to malfunction.

A system $S$ will be associated with the transition diagram $T(S)$ (or simply $T$).
States of $T$ are labeled by complete and consistent sets of fluent literals
corresponding to possible physical states of $S$. The arcs are labeled by subsets of $A$
called compound actions. Execution of a compound action $\{a_1, \ldots, a_k\}$
corresponds to the simultaneous execution of its components. Paths of $T$ correspond
to possible behaviors (or trajectories) of $S$. To reason about $S$ we need to have a
concise and convenient way to define its transition diagram. This will be done by
a system description $SD(S)$ (or simply $SD$) consisting of rules of A-Prolog
defining components of $S$, its fluents and actions, causal laws determining the effects of
these actions, and the actions’ executability conditions. We assume that $SD$ has
a unique answer set which defines an action description of $\mathcal{AL}$. (In our further
discussion we will identify this action description with $SD$.) Causal laws of $SD$
can be divided into two parts. The first part, $SD_n$, contains laws describing
normal behavior of the system. Their bodies usually contain special fluent literals
of the form $\neg ab(c)$. As usual $ab(c)$ is read as “component $c$ of $S$ is abnormal”. Its
use in diagnosis goes back to [15]. The second part, $SD_b$, describes effects of
exogenous actions damaging the components. Such laws normally contain relation
$ab$ in the head or positive parts of the bodies.

In addition to describing all possible trajectories of $S$, we need to describe the
history of $S$ up to a current moment $n$. This is done by a collection $\Gamma_n$ of
statements in the ‘history description’ part of $\mathcal{AL}$. We assume that the system’s
time is discrete and $t$ and $t+1$ stand for two consecutive moments of time in
the interval $0 \ldots n$. Statements of $\Gamma_n$ have the form:

1. $obs(l, t)$ - ‘fluent literal $l$ was observed to be true at moment $t$’;

2. $hpd(a, t)$ - elementary action $a \in A$ was observed to happen at moment $t$

where $0 \le t \le n$. For simplicity we only consider histories with observations
closed under the static causal rules of $\mathcal{AL}$ (i.e. if every state of $S$ must satisfy a
constraint ‘fluent literal $l_0$ is true if fluent literals from $P$ are true’ and literals
from $P$ are observed in the history then so must be $l_0$). Let $S$ be a system with the
transition diagram $T$ and let $\Gamma_n$ be a history of $S$ up to moment $n$. A path
$\sigma_0, a_0, \sigma_1, \ldots, a_{n-1}, \sigma_n$ in $T$ is a model of $\Gamma_n$ iff

1. $a_k = \{a : hpd(a, k) \in \Gamma_n\}$;

2. if $obs(l, k) \in \Gamma_n$ then $l \in \sigma_k$.

$\Gamma_n$ is consistent (with respect to $T$) if it has a model. A fluent literal $l$ holds
in a model $M$ at time $k \le n$ ($M \models h(l, k)$) if $l \in \sigma_k$. Finally, $\Gamma_n \models h(l, k)$
if, for every model $M$ of $\Gamma_n$, $M \models h(l, k)$. Notice that, in contrast to action
description language $\mathcal{L}$ from [2], [3], a domain description of $\mathcal{AL}$ is consistent
only if changes in the observations of the system’s states can be explained without
assuming occurrences of any action not recorded in $\Gamma_n$.
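
The two conditions defining a model of a history can be checked mechanically. The Python sketch below does so for an explicitly given path. Representing literals as strings with a leading "-" for negation, and history statements as tuples, are conventions assumed here for illustration.

```python
def is_model(states, actions, history):
    """Check whether the path states[0], actions[0], states[1], ... is a
    model of `history`.

    states:  list of sets of fluent literals, e.g. {"on(b)", "-ab(b)"}
    actions: list of sets of action names; actions[k] labels the arc
             leaving states[k]
    history: set of tuples ("hpd", action, k) or ("obs", literal, k)
    """
    # Statements about moments outside the path cannot be satisfied.
    for kind, _item, k in history:
        if kind == "hpd" and k >= len(actions):
            return False
        if kind == "obs" and k >= len(states):
            return False
    # Condition 1: each arc carries exactly the recorded actions.
    for k, a_k in enumerate(actions):
        recorded = {item for kind, item, j in history if kind == "hpd" and j == k}
        if a_k != recorded:
            return False
    # Condition 2: every observed literal belongs to the matching state.
    return all(item in states[k] for kind, item, k in history if kind == "obs")


s0 = {"-closed(s1)", "-closed(s2)", "-active(r)", "-on(b)", "prot(b)"}
s1 = {"closed(s1)", "closed(s2)", "active(r)", "on(b)", "prot(b)"}
history = {("hpd", "close(s1)", 0), ("obs", "-closed(s1)", 0), ("obs", "prot(b)", 0)}
ok = is_model([s0, s1], [{"close(s1)"}], history)
```

Condition 1 uses set equality rather than inclusion, so a path that contains an unrecorded action is rejected. This is what forces every change in the observations to be explained by recorded actions.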

The following is a description, $SD$, of system $S$ from Example 1:

Fluents:

    comp(r).  comp(b).  switch(s1).  switch(s2).
    f(active(r)).  f(on(b)).  f(prot(b)).
    f(closed(SW)) ← switch(SW).
    f(ab(X)) ← comp(X).

Agent Actions:

    a_act(close(s1)).

Exogenous Actions:

    x_act(brk).
    x_act(srg).

Causal Laws and Executability Conditions describing normal functioning of $S$:

    causes(close(s1), closed(s1), []).
    caused(active(r), [closed(s1), ¬ab(r)]).
    caused(closed(s2), [active(r)]).
    caused(on(b), [closed(s2), ¬ab(b)]).
    caused(¬on(b), [¬closed(s2)]).
    impossible_if(close(s1), [closed(s1)]).

($causes(A, L, P)$ says that execution of action $A$ in a state satisfying fluent
literals from $P$ causes fluent literal $L$ to become true in a resulting state; $caused(L, P)$
means that every state satisfying $P$ must also satisfy $L$; $impossible\_if(A, P)$
indicates that action $A$ is not executable in states satisfying $P$.) The system’s
malfunctioning information is given by:

    causes(brk, ab(b), []).
    causes(srg, ab(r), []).
    causes(srg, ab(b), [¬prot(b)]).
    caused(¬on(b), [ab(b)]).
    caused(¬active(r), [ab(r)]).

Now consider a history, $\Gamma_0$, of $S$:

    hpd(close(s1), 0).
    obs(¬closed(s1), 0).
    obs(¬closed(s2), 0).
    obs(¬ab(b), 0).
    obs(¬ab(r), 0).
    obs(prot(b), 0).

It is easy to see that the path $\langle \sigma_0, close(s_1), \sigma_1 \rangle$ is the only model of $\Gamma_0$ and that
$\Gamma_0 \models h(on(b), 1)$.


2  Basic  Definitions 

Let $S$ be a system with the transition diagram $T$, $n$ be a moment of time, $O_n$
be a collection of observations made by the agent starting at $n$, and $\Gamma_{n-1}$ be the
previous history of $S$. We say that a pair

    $\mathcal{S} = (\Gamma_{n-1}, O_n)$    (1)

is a symptom of the system’s malfunctioning if $\Gamma_{n-1}$ is consistent (w.r.t. $T$) and
$\Gamma_{n-1} \cup O_n$ is not. Our definition of a candidate diagnosis of symptom (1) is based
on the notion of explanation from [1]. In our terminology, an explanation of
symptom (1) is a collection of statements

    $E = \{hpd(a_i, t) : 0 \le t < n \text{ and } a_i \in A_e\}$    (2)

such that $\Gamma_{n-1} \cup O_n \cup E$ is consistent.

Definition 1. A candidate diagnosis $D$ of symptom (1) consists of an
explanation $E(D)$ of (1) together with the set $\Delta(D)$ of components of $S$ which could
possibly be damaged by actions from $E(D)$. More precisely, $\Delta(D) = \{c : M \models
h(ab(c), n-1)\}$ for some model $M$ of $\Gamma_{n-1} \cup O_n \cup E(D)$.

Definition 2. We say that a diagnosis of a symptom $\mathcal{S} = (\Gamma_{n-1}, O_n)$ is a
candidate diagnosis in which all components in $\Delta$ are faulty.


3  Computing  Candidate  Diagnoses 

In this section we show how the need for diagnosis can be determined and
candidate diagnoses found by the techniques of answer set programming [10].

Consider a system description $SD$ of $S$ whose behavior up to the moment $n-1$
from some interval $[0, N]$ is described by history $\Gamma_{n-1}$. (We assume that $N$ is
sufficiently large for our application.) We start by describing an encoding of
$SD$ into programs of A-Prolog suitable for execution by SMODELS [14]. Since
SMODELS takes as an input programs with finite Herbrand bases, references to
lists should be eliminated from $SD$. To do that we expand the signature of $SD$
by new terms - names of the corresponding causal laws - and consider a mapping
$\alpha$ defined as follows:

1. $\alpha(causes(a, l_0, [l_1 \ldots l_m]))$ is the collection of atoms $d\_law(d)$, $head(d, l_0)$,
$action(d, a)$, $prec(d, i, l_i)$ for $1 \le i \le m$, and $prec(d, m+1, nil)$. (Here and
below $d$ will refer to the name of the corresponding law.)

2. $\alpha(caused(l_0, [l_1 \ldots l_m]))$ is the collection of atoms $s\_law(d)$, $head(d, l_0)$,
$prec(d, i, l_i)$ for $1 \le i \le m$, and $prec(d, m+1, nil)$.

3. $\alpha(impossible\_if(a, [l_1 \ldots l_m]))$ is a constraint

    ← h(l_1, T), ..., h(l_m, T), o(a, T).

where $o(a, t)$ stands for ‘action $a$ occurred at time $t$’.
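
The first two cases of the mapping flatten a list of preconditions into numbered atoms terminated by nil. A minimal Python sketch of that flattening follows; the law names law1 and law2 and the string spelling neg(...) for negative literals are hypothetical choices, not taken from the paper.

```python
def alpha_causes(d, action, head, preconditions):
    """Atoms for a dynamic law causes(action, head, [l_1 ... l_m]) named d."""
    atoms = [f"d_law({d})", f"head({d},{head})", f"action({d},{action})"]
    atoms += [f"prec({d},{i},{l})" for i, l in enumerate(preconditions, start=1)]
    # The list terminator is stored at position m + 1.
    atoms.append(f"prec({d},{len(preconditions) + 1},nil)")
    return atoms


def alpha_caused(d, head, preconditions):
    """Atoms for a static law caused(head, [l_1 ... l_m]) named d."""
    atoms = [f"s_law({d})", f"head({d},{head})"]
    atoms += [f"prec({d},{i},{l})" for i, l in enumerate(preconditions, start=1)]
    atoms.append(f"prec({d},{len(preconditions) + 1},nil)")
    return atoms


dynamic = alpha_causes("law1", "close(s1)", "closed(s1)", [])
static = alpha_caused("law2", "active(r)", ["closed(s1)", "neg(ab(r))"])
```

Because every precondition gets its own index, a program can walk through the body of a law one position at a time without list terms, which keeps the Herbrand base finite.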

By $\alpha(SD)$ we denote the result of applying $\alpha$ to the laws of $SD$. Finally, for any
history, $\Gamma$, of $S$

    $\alpha(SD, \Gamma) = \Pi \cup \alpha(SD) \cup \Gamma$

where $\Pi$ is defined as follows:

    1. h(L, T') ← d_law(D), head(D, L), action(D, A), o(A, T), prec_h(D, T).
    2. h(L, T) ← s_law(D), head(D, L), prec_h(D, T).
    3. all_h(D, N, T) ← prec(D, N, nil).
    4. all_h(D, N, T) ← prec(D, N, P), h(P, T), all_h(D, N', T).
    5. prec_h(D, T) ← all_h(D, 1, T).
    6. h(L, T') ← h(L, T), not h(L̄, T').    (L̄ is the literal complementary to L)
    7. o(A, T) ← hpd(A, T).
    8. h(L, 0) ← obs(L, 0).
    9. ← obs(L, T), not h(L, T).

Here $D$, $A$, $L$ are variables for the names of laws, actions, and fluent literals
respectively, $T$, $T'$ denote consecutive time points from the interval $[0, N]$, and
$N$, $N'$ are variables for consecutive integers. (The corresponding typing
predicates in the bodies of some rules of $\Pi$ are omitted to save space; $o$ is used instead
of $hpd$ to distinguish between actions observed and actions hypothesized.) The
following terminology will be useful for describing the relationship between
answer sets of $\alpha(SD, \Gamma_{n-1})$ and models of $\Gamma_{n-1}$.

We say that an answer set $AS$ of $\alpha(SD, \Gamma_{n-1})$ defines the trajectory
$p = \sigma_0, a_0, \sigma_1, \ldots, a_{n-2}, \sigma_{n-1}$ where $\sigma_k = \{l : h(l, k) \in AS\}$ and $a_k = \{a :
o(a, k) \in AS\}$.

The following theorem establishes the relationship between the theory of actions
in $\mathcal{AL}$ and logic programming.

Theorem 1. If the initial situation of $\Gamma_{n-1}$ is complete, i.e. for any fluent $f$
of $SD$, $\Gamma_{n-1}$ contains $obs(f, 0)$ or $obs(\neg f, 0)$, then $M$ is a model of $\Gamma_{n-1}$ iff $M$
is a trajectory defined by some answer set of $\alpha(SD, \Gamma_{n-1})$.

(The theorem is similar to the result from [18] which deals with a different
language and uses the definitions from [11].)

Now let $\mathcal{S}$ be a symptom of the form (1), and let

    $TEST(\mathcal{S}) = \alpha(SD, \Gamma_{n-1}) \cup O_n \cup R$    (3)

where $R$ consists of the rules

    obs(f, 0) ← not obs(¬f, 0).
    obs(¬f, 0) ← not obs(f, 0).

for any fluent $f \in F$. The rules of $R$ are sometimes called the awareness axioms.
They guarantee that initially the agent considers all possible values of the
domain fluents. (If the agent’s information about the initial state of the system is
complete these axioms can be omitted.) The following corollary forms the basis
for our diagnostic algorithms.

Corollary 1. Let $\mathcal{S} = (\Gamma_{n-1}, O_n)$ where $\Gamma_{n-1}$ is consistent. Then $\mathcal{S}$ is a
symptom of the system’s malfunctioning iff the program
$TEST(\mathcal{S})$ has no answer set.

To diagnose the system, $S$, we construct a program, $DM$, defining an
explanation space of our diagnostic agent - a collection of sequences of exogenous
events which could happen (unobserved) in the system’s past and serve as
possible explanations of unexpected observations. We call such programs diagnostic
modules for $S$. The simplest diagnostic module, $DM_0$, is defined by rules:

    o(A, T) ← 0 ≤ T < n, x_act(A), not ¬o(A, T).
    ¬o(A, T) ← 0 ≤ T < n, x_act(A), not o(A, T).

or, in the more compact choice rule notation of SMODELS ([16]):

    {o(A, T) : x_act(A)} ← 0 ≤ T < n.

(Recall that a choice rule has the form

    m {p(X) : q(X)} n ← body

and says that, if the body is satisfied by an answer set $AS$ of a program then
$AS$ must contain between $m$ and $n$ atoms of the form $p(t)$ such that $q(t) \in AS$.)

Finding candidate diagnoses of symptom (1) can be reduced to finding answer
sets of a diagnostic program

    $\mathcal{D}_0(\mathcal{S}) = TEST(\mathcal{S}) \cup DM_0$    (4)

It is not difficult to see that $DM_0$ generates every possible sequence of the past
occurrences of exogenous actions and hence, by Theorem 1, $\mathcal{D}_0(\mathcal{S})$ finds all the
candidate diagnoses of $\mathcal{S}$.

Example 2. Let us again consider system $S$ from Example 1. According to $\Gamma_0$,
initially the switches $s_1$ and $s_2$ are open, all circuit components are ok, $s_1$ is
closed by the agent, and $b$ is protected. It is predicted that $b$ will be on at
time 1. Suppose that, instead, the agent observes that at time 1 bulb $b$ is off, i.e.
$O_1 = \{obs(\neg on(b), 1)\}$. Intuitively, this is viewed as a symptom $\mathcal{S}_0 = (\Gamma_0, O_1)$
of malfunctioning of $S$. By running SMODELS on $TEST(\mathcal{S}_0)$ we discover that
this program has no answer sets and therefore, by Corollary 1, $\mathcal{S}_0$ is indeed a
symptom. Diagnoses of $\mathcal{S}_0$ can be found by running SMODELS on $\mathcal{D}_0(\mathcal{S}_0)$ and
extracting the necessary information from the computed answer sets. It is easy
to check that, as expected, there are three candidate diagnoses:

    $D_1 = (\{o(brk, 0)\}, \{b\})$

    $D_2 = (\{o(srg, 0)\}, \{r\})$

    $D_3 = (\{o(brk, 0), o(srg, 0)\}, \{b, r\})$

which corresponds to our intuition. Theorem 1 guarantees correctness of this
computation.
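
The three candidate diagnoses can be reproduced without an answer set solver by brute force. The Python sketch below is only an illustration under stated assumptions: it encodes the circuit laws by hand, takes a state to be a complete assignment closed under the static laws, and takes a successor state $\sigma'$ to be one that equals the closure of the direct effects together with the literals preserved from $\sigma$. The executability condition is not checked, since $closed(s_1)$ is false initially. This is not the authors' encoding.

```python
from itertools import combinations, product

FLUENTS = ["closed(s1)", "closed(s2)", "active(r)", "on(b)", "prot(b)", "ab(r)", "ab(b)"]
# Static laws (head, body) and dynamic laws (action, head, preconditions);
# a literal is a pair (fluent, truth value).
STATIC = [
    (("active(r)", True), [("closed(s1)", True), ("ab(r)", False)]),
    (("closed(s2)", True), [("active(r)", True)]),
    (("on(b)", True), [("closed(s2)", True), ("ab(b)", False)]),
    (("on(b)", False), [("closed(s2)", False)]),
    (("on(b)", False), [("ab(b)", True)]),
    (("active(r)", False), [("ab(r)", True)]),
]
DYNAMIC = [
    ("close(s1)", ("closed(s1)", True), []),
    ("brk", ("ab(b)", True), []),
    ("srg", ("ab(r)", True), []),
    ("srg", ("ab(b)", True), [("prot(b)", False)]),
]
EXOGENOUS = ["brk", "srg"]


def closure(literals):
    """Smallest superset of `literals` closed under the static laws."""
    result = set(literals)
    changed = True
    while changed:
        changed = False
        for head, body in STATIC:
            if head not in result and all(b in result for b in body):
                result.add(head)
                changed = True
    return result


def all_states():
    """Complete assignments that the static laws leave unchanged."""
    for values in product([True, False], repeat=len(FLUENTS)):
        s = set(zip(FLUENTS, values))
        if closure(s) == s:
            yield frozenset(s)


def successors(state, actions):
    effects = {head for a, head, pre in DYNAMIC
               if a in actions and all(p in state for p in pre)}
    return [s2 for s2 in all_states() if closure(effects | (state & s2)) == s2]


def candidate_diagnoses(initial_obs, agent_actions, later_obs):
    found = []
    starts = [s for s in all_states() if initial_obs <= s]
    for size in range(len(EXOGENOUS) + 1):
        for extra in combinations(EXOGENOUS, size):
            for s0 in starts:
                for s1 in successors(s0, set(agent_actions) | set(extra)):
                    if later_obs <= s1:
                        damaged = {f[3:-1] for f, v in s1 if v and f.startswith("ab(")}
                        found.append((set(extra), damaged))
    return found


initial_obs = {("closed(s1)", False), ("closed(s2)", False),
               ("ab(b)", False), ("ab(r)", False), ("prot(b)", True)}
diagnoses = candidate_diagnoses(initial_obs, {"close(s1)"}, {("on(b)", False)})
```

The empty set of exogenous actions yields no entry, which mirrors the fact that the observation is a symptom; the remaining entries pair each explanation with the components that are abnormal in the resulting state.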


The basic diagnostic module $\mathcal{D}_0$ can be modified in many different ways. For
instance, a simple modification, which eliminates some candidate diagnoses
containing actions unrelated to the corresponding symptom, can be constructed
as follows. Let

    $DM_1 = DM_0 \cup REL$

where $REL$ consists of the rules

    1. rel(A, L) ← d_law(D), head(D, L), action(D, A), x_act(A).
    2. rel(A, L) ← s_law(D), head(D, L), prec(D, N, P), rel(A, P), x_act(A).
    3. rel(A) ← obs(L, T), T ≥ n, rel(A, L).
    4. ← T < n, o(A, T), not hpd(A, T), not rel(A).

and let

    $\mathcal{D}_1(\mathcal{S}) = TEST(\mathcal{S}) \cup DM_1$

It is easy to see that this modification is safe, i.e. $\mathcal{D}_1$ will not miss any useful
predictions about the malfunctioning components.^

Example 3. Let us expand the system $S$ from Example 1 by a new component, $c$,
unrelated to the circuit, and an exogenous action $a$ which damages this
component. It is easy to see that $\mathcal{S}_0$ from Example 2 will still be a symptom
of malfunctioning of the new system, $S_a$, and that the basic diagnostic module
applied to $S_a$ will return diagnoses $D_1$ - $D_3$ from Example 2 together with new
diagnoses containing $a$ and $ab(c)$, e.g.

    $D_4 = (\{o(brk, 0), o(a, 0)\}, \{b, c\})$

Diagnostic module $\mathcal{D}_1$ will ignore actions unrelated to $\mathcal{S}_0$ and return only $D_1$ - $D_3$.
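
The relevance relation is a simple fixpoint over the causal laws. The Python sketch below computes which exogenous actions are relevant to a set of observed literals for the expanded system. Encoding literals as strings with a leading "-" and passing the laws as tuples are assumptions for illustration.

```python
def relevant_actions(d_laws, s_laws, exogenous, observed):
    """Exogenous actions related to at least one observed literal.

    d_laws: tuples (action, head, preconditions)
    s_laws: tuples (head, body)
    """
    # An action is related to the head of each dynamic law it triggers.
    rel = {(a, head) for a, head, _pre in d_laws if a in exogenous}
    changed = True
    while changed:
        changed = False
        # A static law propagates relevance from a body literal to its head.
        for head, body in s_laws:
            for a in exogenous:
                if (a, head) not in rel and any((a, p) in rel for p in body):
                    rel.add((a, head))
                    changed = True
    return {a for a, literal in rel if literal in observed}


d_laws = [
    ("close(s1)", "closed(s1)", []),
    ("brk", "ab(b)", []),
    ("srg", "ab(r)", []),
    ("srg", "ab(b)", ["-prot(b)"]),
    ("a", "ab(c)", []),  # the unrelated action of the expanded system
]
s_laws = [
    ("active(r)", ["closed(s1)", "-ab(r)"]),
    ("closed(s2)", ["active(r)"]),
    ("on(b)", ["closed(s2)", "-ab(b)"]),
    ("-on(b)", ["-closed(s2)"]),
    ("-on(b)", ["ab(b)"]),
    ("-active(r)", ["ab(r)"]),
]
related = relevant_actions(d_laws, s_laws, {"brk", "srg", "a"}, {"-on(b)"})
```

Here the action a is related only to ab(c), which is never observed, so hypothesising it would be pruned by the constraint in the relevance module.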

It may be worth noticing that the distinction between $hpd$ and $o$ allows actions
unrelated to observations at $n$ to actually happen at moment $n-1$. Constraint
(4) of $REL$ only prohibits generating such actions in our search for diagnosis.
Even more unrelated actions can be eliminated from the search space of our
diagnostic modules by considering relevance relation $rel$ depending on time.
The diagnostic module $\mathcal{D}_1$ can also be further modified by limiting its search to
recent occurrences of exogenous actions. This can be done by

    $\mathcal{D}_2(\mathcal{S}) = TEST(\mathcal{S}) \cup DM_2$

where $DM_2$ is obtained by replacing an atom $0 \le T < n$ in the bodies of rules
of $DM_0$ by $n - m \le T < n$. The constant $m$ determines the time interval in the
past that an agent is willing to consider in its search for possible explanations.
To simplify our discussion in the rest of the paper we assume that $m = 1$. Finally,
the rule

    ← k {o(A, n-1)}.

added to $DM_2$ will eliminate all diagnoses containing more than $k$ actions. Of
course the resulting module $\mathcal{D}_3$ as well as $\mathcal{D}_2$ can miss some diagnoses, and
deepening of the search and/or increase of $k$ may be necessary if no diagnosis
of a symptom is found. There are many other interesting ways of constructing
efficient diagnostic modules. We are especially intrigued by the possibilities of
using new features of answer set solvers such as weight rules of SMODELS and
soft constraints of DLV [19] to specify a preference relation on diagnoses. This
however is a subject of further investigation.

Suppose now the diagnostician has
a candidate diagnosis $D$ of a symptom $\mathcal{S}$. Is it indeed a diagnosis?
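
The effect of the window $m$ and the bound $k$ on the size of the explanation space can be seen in a short Python sketch. It follows the prose reading that at most $k$ hypothesised occurrences are allowed; the function name and the pair representation of $o(A, T)$ are assumptions for illustration.

```python
from itertools import combinations


def explanation_space(exogenous, n, m=None, k=None):
    """Yield sets of hypothesised occurrences (action, time).

    m restricts the time points to n-m <= T < n (all of 0 <= T < n if None);
    k caps the number of occurrences in one explanation (no cap if None).
    """
    start = 0 if m is None else max(0, n - m)
    atoms = [(a, t) for t in range(start, n) for a in exogenous]
    limit = len(atoms) if k is None else min(k, len(atoms))
    for size in range(limit + 1):
        for chosen in combinations(atoms, size):
            yield set(chosen)


full = list(explanation_space(["brk", "srg"], 3))
recent = list(explanation_space(["brk", "srg"], 3, m=1))
bounded = list(explanation_space(["brk", "srg"], 3, m=1, k=1))
```

With two exogenous actions and $n = 3$ the unrestricted space has 64 members, the window $m = 1$ leaves 4, and the additional bound $k = 1$ leaves 3. This is why the restricted modules are cheaper and also why they can miss diagnoses.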

^ In the full paper we will make this and other similar statements mathematically
precise.

4 Finding a Diagnosis

To answer this question the agent should be able to test components of $\Delta(D)$.
Assuming that no exogenous actions occur during testing, a diagnosis can be
found by the following simple algorithm, $Find\_Diag(\mathcal{S})$:

    function Find_Diag(S)
      repeat
        (E, Δ) := Candidate_Diag(S);
        diag := true; Δ_0 := Δ;
        while Δ_0 ≠ ∅ and diag do
          select c ∈ Δ_0; Δ_0 := Δ_0 \ {c};
          if faulty(c) then
            O_n := O_n ∪ obs(ab(c), n);
          else
            O_n := O_n ∪ obs(¬ab(c), n);
            diag := false;
          end
        end {while}
      until diag or Δ = ∅
      return (E, Δ).

The algorithm uses functions $Candidate\_Diag(\mathcal{S})$, which returns a candidate
diagnosis $(E, \Delta)$ of $\mathcal{S}$, and $faulty(c)$, which checks if a component $c$ of $S$ is faulty.
Notice that $\Delta = \emptyset$ indicates that no diagnosis is found - the diagnostician failed.
To illustrate the algorithm, consider

Example  Consider  the  system  S  from  Example  1  and  a  history  /q  in  which  b  is 
not  protected,  all  components  of  S  are  ok,  both  switches  are  open,  and  the  agent 
closes  Si  at  time  0.  At  time  1,  he  observes  that  the  bulb  b  is  not  lit,  considers  S  = 
(/o,Oi)  where  Oi  —  {o6s(“ion(6),  1)}  and  calls  function  NeedS>iag{S)  which 
searches  for  an  answer  set  olTEST(S).  There  are  no  such  sets,  the  diagnostician 
realizes  he  has  a  symptom  to  diagnose  and  calls  function  FindJ)iag(s).  Let  us 
assume  that  the  first  call  to  Candidate  JDiag  returns 

PD1 = ({o(srg, 0)}, {r, b})

Suppose that the agent selects component r from Δ and determines that it is not
faulty. Observation obs(¬ab(r), 1) will be added to O1, diag will be set to false,
and the program will call Candidate_Diag again with the updated symptom S
as a parameter. Candidate_Diag will return another possible diagnosis

PD2 = ({o(brk, 0)}, {b})

The agent will test bulb b, find it to be faulty, add observation obs(ab(b), 1) to O1,
and return PD2.
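
The test loop of Find_Diag can be sketched in Python as follows. This is only an illustrative sketch: the candidate table CANDIDATES, the tuple encoding of observations, and the stub candidate_diag are assumptions made for this example (in the paper, candidates are computed by an answer set solver), and time stamps are omitted.

```python
# Hypothetical candidate diagnoses for the bulb example: (exogenous action, components).
CANDIDATES = [(("srg", 0), ("r", "b")), (("brk", 0), ("b",))]

def candidate_diag(observations):
    """Stub: return the first candidate not refuted by a recorded 'not_ab' observation."""
    for event, delta in CANDIDATES:
        if not any(("not_ab", c) in observations for c in delta):
            return event, delta
    return None, ()  # empty delta: no candidate is left

def find_diag(candidate_diag, faulty, observations):
    """Test the components of successive candidates until one survives or none is left."""
    while True:
        event, delta = candidate_diag(observations)
        diag = True
        remaining = list(delta)
        while remaining and diag:
            c = remaining.pop(0)
            if faulty(c):
                observations.add(("ab", c))
            else:
                observations.add(("not_ab", c))
                diag = False  # candidate refuted; ask for another one
        if diag or not delta:
            return event, delta

observations = set()
result = find_diag(candidate_diag, lambda c: c == "b", observations)
```

With a bulb that is the only faulty component, the first candidate is refuted when r is tested, and the second candidate is returned, mirroring Example 4.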

Now  let  us  consider  a  different  scenario: 
Example 5. Let Γ0 and observation O1 be as in Example 4 and suppose that
the program's first call to Candidate_Diag returns PD2, b is found to be faulty,
obs(ab(b), 1) is added to O1, and Find_Diag returns PD2. The agent proceeds
to have b repaired but, to his disappointment, discovers that b is still not on!
Intuitively this means that PD2 is a wrong diagnosis: there must have been a
power surge at 0.

The example shows that, in order to find a correct explanation of a symptom, it
is essential for an agent to repair damaged components and observe the behavior
of the system after repair. For simplicity we assume that, as with testing,
repair occurs in a well-controlled environment, i.e., no exogenous actions happen
during the repair process. To formally model this process we introduce a special
action, repair(c), for every component c of S. The effect of this action is
defined by the causal law:

causes(repair(c), ¬ab(c), [])

The diagnostic process is now modeled by the following algorithm. (Here
S = (Γ_{n-1}, O), and {obs(f_i, k)} is a collection of observations the diagnostician
makes to test his repair at moment k.)

procedure Diagnose(S)
  k := n;
  while Need_Diag(S) do
    (E, Δ) := Find_Diag(S);
    if Δ = ∅ then
      no diagnosis
    else
      Repair(Δ);
      O := O ∪ {hpd(repair(c), k) : c ∈ Δ};
      O := O ∪ {obs(f_i, k)};
    end
  end

Example  6.  To  illustrate  the  above  algorithm  let  us  go  back  to  the  agent  from 
Example  5  who  just  discovered  diagnosis  Di.  He  will  repair  the  bulb  and  check 
if  the  bulb  is  lit.  It  is  not,  and  therefore  a  new  observation  is  recorded  as  follows: 

O1 := O1 ∪ {hpd(repair(b), 1), obs(¬on(b), 2)}

Need_Diag(S) will detect a continued need for diagnosis, and Find_Diag(S) will
return  Ds,  which,  after  new  repair  and  testing  will  hopefully  prove  to  be  the 
right  diagnosis. 
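
The repair-and-retest loop can be sketched in Python as below. The toy world (a set of broken components, a bulb that is on only when nothing is broken), the stubbed find_diag that simply proposes two fixed candidates in turn, and the string encoding of observations are all assumptions made for illustration. The sketch also advances k between recording the repair and recording the test observations, as Example 6 suggests; the pseudocode above does not show that step.

```python
def diagnose(need_diag, find_diag, repair, observe, history, k):
    """Repeat diagnose/repair/observe; return the extended history, or None on failure."""
    while need_diag(history):
        _event, delta = find_diag(history)
        if not delta:
            return None                      # "no diagnosis"
        repair(delta)
        history += [("hpd", f"repair({c})", k) for c in delta]
        k += 1                               # assumption: test observations come one step later
        history += observe(k)
    return history

# Toy world (hypothetical): a surge broke both r and b.
broken = {"r", "b"}
proposals = iter([("brk", ("b",)), ("srg", ("r", "b"))])

def need_diag(history):
    return history[-1][1] == "not on(b)"     # the last entry is always an observation

def find_diag(history):
    return next(proposals, (None, ()))       # stub for Find_Diag

def repair(delta):
    broken.difference_update(delta)

def observe(k):
    return [("obs", "on(b)" if not broken else "not on(b)", k)]

result = diagnose(need_diag, find_diag, repair, observe, [("obs", "not on(b)", 1)], 1)
```

In this run the first repair (of b alone) leaves the bulb off, so a second round of diagnosis and repair follows before the symptom disappears.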

The diagnosis produced by the above algorithm can be viewed as a reasonable
interpretation of discrepancies between the agent's predictions and actual
observations. To complete our analysis of step 1 of the agent's acting and reasoning
loop we need to explain how this interpretation can be incorporated in the
agent's history. If the diagnosis discovered is unique then the answer is obvious:
O is simply added to Γ_{n-1}. If, however, faults of the system components can be
caused by different sets of exogenous actions, the situation becomes more subtle.
A complete investigation of the issues involved is the subject of further research.

5  Related  Work 

There are numerous papers on diagnosis, many of which substantially
influenced the authors' views on the subject. The roots of our approach go back
to [15], where diagnosis for a static environment was formally defined in logical
terms. Recent expansions of this work [17,12,3], which take into account the
dynamics of the system's behavior, served as the starting point of the work presented
in this paper. We

1.  substantially  simplified  the  basic  definitions  of  [3]; 

2. presented reasonably efficient and provably correct algorithms for computing
'dynamic' diagnosis;

3.  showed  how  to  combine  diagnostics  with  planning  and  other  activities  of  a 
reasoning  agent. 

The simplification of the basic definitions from [3] is achieved by a careful choice of the
history description language: AL seems to be more suitable for our purposes
than C used in [3]. The reasoning algorithms are based on recent discoveries of a
close relationship between A-Prolog and reasoning about effects of actions [11]
and on ideas from answer set programming [10,13,9]. This approach, of course,
would be impossible without the existence of efficient answer set reasoning systems.
Finally, the integration of diagnostics and other activities is based on the agent
architecture from [1].

6  Conclusion 

The paper describes ongoing work on the development of a diagnostic problem
solving agent in A-Prolog. In particular we are looking for good modeling
techniques with clear and provably correct algorithms. The following can be of
interest to people who share these interests:

•  definitions  of  a  symptom,  candidate  diagnosis,  and  diagnosis  which  we  believe 
to  be  substantially  simpler  than  other  similar  approaches; 

•  a  new  algorithm  for  computing  candidate  diagnoses.  (The  algorithm  is  based 
on  answer  set  programming  and  views  the  search  for  candidate  diagnoses  as 
‘planning  in  the  past’); 

•  a  simple  account  of  diagnostics,  testing  and  repair  based  on  the  use  of  answer 
set  solvers. 

In the full paper we plan to give a mathematical analysis of the correctness of the
corresponding algorithms and to test them on medium-size examples.
Abstract. In this paper we present a declarative approach to adding
domain-dependent control knowledge for Answer Set Planning (ASP).
Our approach allows different types of domain-dependent control knowledge,
such as hierarchical, temporal, or procedural knowledge, to be represented
and exploited in parallel, thus combining the ideas of control
knowledge in HTN-planning, GOLOG-programming, and planning with
temporal knowledge into ASP. To do so, we view domain-dependent control
knowledge as sets of independent constraints. An advantage of this
approach is that domain-dependent control knowledge can be modularly
formalized and added to the planning problem as desired. We define a set
of constructs for constraint representation and provide a set of domain-independent
logic programming rules for checking constraint satisfaction.


1  Introduction 

Planning is hard. The complexity of classical planning is known to be PSPACE-complete
for finite domains and undecidable in the general case [8,12]. By fixing
the length of plans, the planning problem reduces to NP-complete or worse.
Planning systems such as FF [16], HSP [6], Graphplan [5], and Blackbox [18]
have greatly improved their performance on benchmark planning
problems by exploiting domain-independent search heuristics, clever encodings
of knowledge, and efficient data structures [30]. Nevertheless, despite impressive
improvements in performance, there is a growing belief that planners that
exploit domain-dependent control knowledge may provide the key to future
performance gains [30]. This conjecture is supported by the impressive performance
of planners such as TLPlan [1], TALplan [11] and SHOP [26], all of which exploit
domain-dependent control knowledge.

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  226-239,  2001. 
A central issue in incorporating domain-dependent control knowledge into
a planner is to identify the classes of knowledge to incorporate and to devise
a means of representing and reasoning with this knowledge. In the past, planners
such as TLPlan and TALplan have exploited domain-dependent temporal
knowledge; SHOP and various hierarchical task network (HTN) planners
have exploited domain-dependent hierarchical and partial-order knowledge; and
satisfiability-based planners such as Blackbox have experimented with a variety
of domain-dependent control knowledge encoded as propositional formulae. In
this paper, we propose to exploit temporal knowledge and hierarchical knowledge
as well as what we refer to as procedural knowledge within the paradigm
of answer set planning. We show how these classes of domain-dependent control
knowledge can be represented using a normal logic program and how they can
be exploited by a basic answer set planner. We demonstrate the improvement in
the efficiency of our answer set planner.

The set of programming language constructs provided by the logic programming
language GOLOG (e.g., sequence (;), if-then-else, while, etc.) [20] provides
an example of the class of procedural knowledge we incorporate into our planner.
For example, a procedural constraint written as a1; a2; (a3|a4|a5); f? tells
the planner that it should make a plan where a1 is the first action, a2 is the
second action, and then it should choose one of a3, a4 or a5 such that after their
execution f will be true. This type of domain-dependent control knowledge is
different from temporal knowledge, where plans are restricted to action sequences
that agree with a given set of temporal formulas. Procedural knowledge is also
different from hierarchical and partial-order constraints, where tasks are divided
into smaller tasks, with some partial ordering and other constraints between
them. These three classes of domain-dependent control knowledge differ in their
structure and, while there may be transformations available between one form
and another, it is often natural for a user to express knowledge in a particular
form.

To exploit the above classes of domain-dependent planning constraints we
use the declarative problem-solving paradigm exemplified by satisfiability-based
planners. We refer to such an approach to planning as model-based planning, to
indicate that plans are models of the logical theory describing the planning problem.
One advantage of this approach is that planner development is divided into
two parts: development of model generators for logical languages, and planner
encoding as a logical theory. This enables those developing logical encodings of
model-based planning problems to exploit the diversity of domain-independent
model generators being developed for different tasks.

In this paper, we use an answer set programming approach to model-based
planning. We use logic programming as the logical language to encode our model-based
planning problem. From a knowledge representation perspective, there are
many advantages to a logic programming encoding, as compared to a simple
propositional logic encoding. These include: parsimonious encoding of solutions
to the frame problem in the presence of qualification and ramification constraints;
the presence of the non-classical operator ← that not only helps in encoding
causality but also can be exploited when searching for models; and many fundamental
theoretical results [3] that help construct proofs of the correctness of
encodings. In contrast, few of the encodings of satisfiability-based planners have
proofs of correctness, while most logic programming encodings are accompanied
by a proof of correctness. From the perspective of computation, planners based
on propositional encodings still fare better. There are currently more implementations
of propositional solvers than of logic programming answer set generators,
and the best propositional solvers tend to be faster than the best answer set
generators.

The rest of this paper is organized as follows: we review the basics of the
action language and of answer set planning in the next section. We then introduce
different constructs for domain-dependent control knowledge representation. For
each construct, we provide a set of logic programming rules as its implementation
(Subsections 3.1-3.3). We use Smodels, an implemented system for computing
stable models of logic programs [27], in our experiments. As such, the rules
developed in this paper are written in Smodels syntax and can be used as input
to the Smodels program^. In Subsection 3.4, we describe some experimental results,
and we conclude in Section 4.


2  Preliminaries 

2.1  Action  Theories 

We use the high-level action description language B of [15] to represent action
theories. In such a language, an action theory consists of two finite, disjoint sets
of names called actions and fluents. Actions transition the system from one state
to another. Fluents are propositions whose truth value can change as the result
of actions. Unless otherwise stated, a is used to denote an action, and f and p are
used to denote fluents. The action theory also comprises a set of propositions of
the following form:


caused({p_1, ..., p_n}, f)        (1)

causes(a, f, {p_1, ..., p_n})     (2)

executable(a, {p_1, ..., p_n})    (3)

initially(f)                      (4)

where f and the p_i's are fluent literals (a fluent literal is either a fluent g or its
negation, written as neg(g)) and a is an action. (1) represents a static causal
law, i.e., a ramification constraint. It conveys that whenever the fluent literals
p_1, ..., p_n hold, so does f. (2), referred to as a dynamic causal law, represents
the (conditional) effect of a. Intuitively, a proposition of the form (2) states
that f is guaranteed to be true after the execution of a in any state of the world
where p_1, ..., p_n are true. (3) captures an executability condition of a. It says
that a is executable in a state in which p_1, ..., p_n hold. Finally, propositions of

^  Although  we  use  Smodels,  we  believe  that  the  code  presented  here  could  easily  be 
used  with  DLV  [9],  following  simple  modifications  to  reflect  differences  in  syntax. 
the form (4) are used to describe the initial state. (4) states that f holds in the
initial state.

An action theory is a pair (D, Γ) where D consists of propositions of the
form (1)-(3) and Γ consists of propositions of the form (4). For the purpose
of this paper, it suffices to note that the semantics of such an action theory is
given by a transition graph, represented by a relation t, whose nodes are the
alternative (complete) states of the action theory and whose links (labeled with
actions) represent the transitions between its states (see details in [15]). That is,
if (s, a, s') ∈ t, then there exists a link with label a from state s to state s'.

A trajectory of the system is denoted by a sequence s_0 a_1 s_1 ... a_n s_n where the s_i's
are states, the a_i's are actions, and (s_i, a_{i+1}, s_{i+1}) ∈ t for i ∈ {0, ..., n-1}.
A trajectory s_0 a_1 s_1 ... a_n s_n is a trajectory of a fluent formula Δ if Δ holds in s_n.

In this paper, we will assume that Γ is complete, i.e., for every fluent f, either
initially(f) or initially(neg(f)) belongs to Γ. We will also assume that (D, Γ)
is consistent in the sense that there exists a non-empty relation t representing
the transition graph of (D, Γ).

2.2  Answer  Set  Planning 

A planning problem is specified by a triple (D, Γ, Δ) where (D, Γ) is an action
theory and Δ is a fluent formula (or goal), representing the goal state. A
sequence of actions a_1, ..., a_m is a possible plan for Δ if there exists a trajectory
s_0 a_1 s_1 ... a_m s_m such that s_0 and s_m satisfy Γ and Δ, respectively^.

Given a planning problem (D, Γ, Δ), answer set planning solves it by translating
it into a logic program Π(D, Γ, Δ) (or Π, for short) consisting of domain-dependent
rules that describe D, Γ, and Δ, respectively, and domain-independent
rules that generate action occurrences and represent the transitions between
states.

• Goal representation. To encode Δ, we define formulas and provide a set
of rules for formula evaluation. We consider formulas that are bounded classical
formulas with each bound variable associated with a sort. They are formally
defined as follows.

- A literal is a formula.

- The negation of a formula is a formula.

- A finite conjunction of formulas is a formula.

- A finite disjunction of formulas is a formula.

- If X_1, ..., X_n are variables that can have values from the sorts s_1, ..., s_n,
and f_1(X_1, ..., X_n) is a formula then ∀X_1, ..., X_n. f_1(X_1, ..., X_n) is a
formula.

^ Note that the notion of plan employed here is weaker than the conventional one
where the goal must be achieved on every possible trajectory. This is because an
action theory with causal laws can be non-deterministic. Note, however, that if D
is deterministic, i.e., for every pair (s, a) there exists at most one state s' such that
(s, a, s') ∈ t, then every possible plan for Δ is also a plan for Δ.
- If X_1, ..., X_n are variables that can have values from the sorts s_1, ..., s_n,
and f_1(X_1, ..., X_n) is a formula then ∃X_1, ..., X_n. f_1(X_1, ..., X_n) is a
formula.

A sort called formula is introduced and each non-atomic formula is associated
with a unique name and defined by (possibly) a set of rules. For example, the
conjunction f ∧ g ∧ h is represented by the set of atoms {conj(f'), in(f, f'), in(g, f'),
in(h, f')} where f' is the name assigned to f ∧ g ∧ h; ∀X_1, ..., X_n. f_1(X_1, ..., X_n)
can be represented by the rule

formula(forall(f, f_1(X_1, ..., X_n))) ← in(X_1, s_1), ..., in(X_n, s_n)

where f is the name assigned to the formula. In keeping with previous notation,
negation is denoted by the function symbol neg. For example, if f is the name of
a formula then neg(f) is a formula denoting its negation. Rules to check when a
formula holds or does not hold can be written in a straightforward manner and
are omitted here to save space. (Details can be downloaded from the Web^.)
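
As a rough illustration of what such formula-evaluation rules compute, the following Python sketch evaluates a bounded formula in a single state. It is not the omitted Smodels encoding: the tuple representation, the tags "lit", "neg", "and", "or", "forall", "exists", and the use of a Python function for the quantified body are assumptions made for this example. A state is taken to be the set of fluents that are true.

```python
def hf(phi, state):
    """Return True if formula phi holds in state (a set of true fluents)."""
    tag = phi[0]
    if tag == "lit":
        return phi[1] in state
    if tag == "neg":
        return not hf(phi[1], state)
    if tag == "and":
        return all(hf(q, state) for q in phi[1])
    if tag == "or":
        return any(hf(q, state) for q in phi[1])
    if tag == "forall":      # ("forall", values_of_the_sort, body_builder)
        return all(hf(phi[2](x), state) for x in phi[1])
    if tag == "exists":      # ("exists", values_of_the_sort, body_builder)
        return any(hf(phi[2](x), state) for x in phi[1])
    raise ValueError(f"unknown formula tag: {tag}")

# Hypothetical logistics state: both packages are at location l2.
packages = ["p1", "p2"]
state = {"at(p1,l2)", "at(p2,l2)"}
all_delivered = ("forall", packages, lambda p: ("lit", f"at({p},l2)"))
some_held = ("exists", packages, lambda p: ("lit", f"holding({p})"))
```

Because every quantified variable ranges over a finite sort, quantifiers reduce to finite conjunctions and disjunctions, which is what makes a rule-based encoding possible.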

• Action theory representation. Since each set of literals {p_1, ..., p_n} in
(1)-(3) can be represented by a conjunction of literals, D can be encoded as a
set of facts of Π as follows. First, we assign to each set of fluent literals that
occurs in a proposition of D a distinguished name. The constant nil denotes the
set {}. A set of literals {p_1, ..., p_n} will be replaced by the set of atoms Y =
{conj(s), in(p_1, s), ..., in(p_n, s)} where s is the name assigned to {p_1, ..., p_n}.
With this representation, propositions in D can be easily translated into a set
of facts of Π. For example, a proposition causes(a, f, {p_1, ..., p_n}) with n > 0 is
encoded as a set of atoms consisting of causes(a, f, s) and the set Y (s is the
name assigned to {p_1, ..., p_n}).

• Domain-independent rules. The domain-independent rules of Π are adapted
mainly from [14,10,21,22]. The main predicates in these rules are:

- holds(L, T): L holds at time T,

- possible(A, T): action A is executable at time T,

- occ(A, T): action A occurs at time T, and

- hf(φ, T): formula φ holds at time T.

The main rules are given next. In these rules, T is a variable of the sort time, L and G
are variables denoting fluent literals (written as F or neg(F) for some fluent F), S
is a variable of the sort conj (conjunction), and A and B are variables of the sort
action.

holds(L, T+1) ← occ(A, T), causes(A, L, S), hf(S, T).                  (5)

holds(L, T) ← caused(S, L), hf(S, T).                                  (6)

holds(L, T+1) ← contrary(L, G), holds(L, T), not holds(G, T+1).        (7)

possible(A, T) ← executable(A, S), hf(S, T).                           (8)

holds(L, 0) ← literal(L), initially(L).                                (9)

nocc(A, T) ← A ≠ B, occ(B, T), T < length.                             (10)

occ(A, T) ← T < length, possible(A, T), not nocc(A, T).                (11)

http://www.cs.nmsu.edu/~tson/asp.planner
Here, (5) encodes the effects of actions, (6) encodes the effects of static causal
laws, and (7) is the inertia rule. (8) defines a predicate that determines when an
action can occur and (9) encodes the initial situation. (10)-(11) generate action
occurrences, one at a time. We omit most of the auxiliary rules, such as the rules
defining contradictory literals. The source code and examples can be
retrieved from our Web site.
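
To make the roles of rules (5), (7), (8), (10) and (11) concrete, here is a small Python sketch that simulates them by brute force instead of computing stable models. It is a simplification under stated assumptions: static causal laws (rule (6)) are left out, states are complete sets of literals written as strings, and the toy pick_up/drop domain and all function names are invented for this example.

```python
from itertools import product

def neg(l):
    """Contrary literal: f <-> neg(f)."""
    return l[4:-1] if l.startswith("neg(") else f"neg({l})"

def step(state, action, causes):
    """Successor state: direct effects (rule 5) plus inertia (rule 7)."""
    effects = {f for (a, f, pre) in causes if a == action and pre <= state}
    return frozenset(effects | {l for l in state if neg(l) not in effects})

def plans(init, goal, actions, causes, executable, length):
    """Yield action sequences of the given length, one action per step (rules 10-11)."""
    for seq in product(actions, repeat=length):
        state = frozenset(init)
        for a in seq:
            # rule (8): the action must be executable in the current state
            if not any(a == b and pre <= state for (b, pre) in executable):
                break
            state = step(state, a, causes)
        else:
            if goal <= state:        # goal constraint
                yield seq

# Hypothetical one-fluent domain.
causes = [("pick_up", "holding", frozenset({"neg(holding)"})),
          ("drop", "neg(holding)", frozenset({"holding"}))]
executable = [("pick_up", frozenset({"neg(holding)"})),
              ("drop", frozenset({"holding"}))]
init, goal = {"neg(holding)"}, {"holding"}
actions = ["pick_up", "drop"]
```

An answer set solver avoids this exhaustive enumeration, but the accepted sequences correspond to the same kind of trajectories that the stable models of the program encode.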

Let Π_n(D, Γ, Δ) (or Π_n when it is clear from the context what D, Γ, and
Δ are) be the logic program consisting of

- the set of domain-independent rules in which the domain of T is {0, ..., n},

- the set of atoms encoding D and Γ, and

- the rule ← not hf(Δ, n) that encodes the requirement that Δ holds at n.

The following result (adapted from [22]) shows the equivalence between trajectories
of Δ and stable models of Π_n. Let S be a stable model of Π_n; define
s(i) = {f | holds(f, i) ∈ S} and A[i, j] = a_i, ..., a_j where i and j are integers, f
is a fluent, the a_t's are actions, and occ(a_t, t) ∈ S for every t with i ≤ t ≤ j.

Theorem 1. For a planning problem (D, Γ, Δ),

- if s_0 a_0 ... a_{n-1} s_n is a trajectory of Δ, then there exists a stable model S of
Π_n such that A[0, n-1] = [a_0, ..., a_{n-1}] and s_i = s(i) for i ∈ {0, ..., n},
and

- if S is a stable model of Π_n with A[0, n-1] = [a_0, ..., a_{n-1}] then s(0) a_0 ...
a_{n-1} s(n) is a trajectory of Δ.


3  Control  Knowledge  as  Constraints 

In this section, we add domain-dependent control knowledge to ASP by viewing
it as constraints on the stable models of the program Π. For each type of control
knowledge^, we introduce new constructs for its encoding and present a set of
rules that check when a constraint is satisfied.


3.1  Temporal  Knowledge 

In [1], temporal knowledge is used to prune the search space. Temporal constraints
are specified using a linear temporal logic with a precisely defined semantics.
It is easy to add them to (or remove them from) a planning problem
since their representation is separate from the action and goal representation.
Planners exploiting temporal knowledge to control search have proven to be
highly efficient and to scale up well [2]. In this paper, we represent temporal
knowledge using temporal formulas. In our notation, a temporal formula is
either


^ We henceforth abbreviate domain-dependent control knowledge as control
knowledge.
- a formula (as defined in the previous section), or

- a formula of the form until(φ, ψ), always(φ), eventually(φ), or next(φ)

where φ and ψ are temporal formulas.

For example, in a logistics domain, let P and L denote a package and its
location, respectively. The following formula:

always((goal(P, L) ∧ at(P, L)) ⊃ next(¬holding(P)))        (12)

can be used to express that if the goal is to have a package at a particular
location and if the package is indeed at that location then it is always the case
that the agent will not be holding the package in the next state. This has the
effect of preventing the agent from picking up the package once it is at its goal
location.

Like non-atomic formulas, temporal formulas can be encoded in ASP using
constants, atoms, and rules. For example, the formula until(f, next(g)) is represented
by the set of atoms {tf(n_1, next(g)), tf(n_2, until(f, n_1))} where tf stands
for "temporal formula" and n_1 and n_2 are the new constants assigned to next(g)
and until(f, next(g)), respectively. The semantics of these temporal operators is
the standard one.

To complete the encoding of temporal constraints, we provide the rules for
temporal formula evaluation. The key rules, which define the satisfiability of a
temporal formula N at time T (htf(N, T)) and between T and T' (hd(N, T, T')),
are given below.

htf(N, T) ← formula(N), hf(N, T).                                  (13)

htf(N, T) ← tf(N, N_1), htf(N_1, T).                               (14)

htf(N, T) ← tf(N, until(N_1, N_2)), hd(N_1, T, T'), htf(N_2, T').  (15)

htf(N, T) ← tf(N, always(N_1)), hd(N_1, T, length+1).              (16)

htf(N, T) ← tf(N, eventually(N_1)), htf(N_1, T'), T ≤ T'.          (17)

htf(N, T) ← tf(N, next(N_1)), htf(N_1, T+1).                       (18)

not_hd(N, T, T') ← not htf(N, T''), T < T'' < T'.                  (19)

hd(N, T, T') ← htf(N, T), not not_hd(N, T, T').                    (20)


Having defined temporal constraints and specified when they are satisfied,
adding temporal knowledge to a planning problem in ASP is easy. We must: (i)
encode the knowledge as a temporal formula, say φ; (ii) add the rules (13)-(20) to
Π; and (iii) add the constraint ← not htf(φ, 0) to Π. Step (iii) eliminates models
of Π in which φ does not hold. For example, if Π is the program for planning
in the logistics domain, adding the constraint (12) to Π will eliminate all models
whose corresponding trajectory admits an action occurrence that causes
holding(P) to be true after P is delivered at its destination. As a concrete example,
given the goal formula at(p, l_2), there exists no model of Π that corresponds
to the sequence of actions pick_up(p, l_1), move(l_1, l_2), drop(p, l_2), pick_up(p, l_2).
(We appeal to the users for the intuitive meaning of the effects of actions, the
initial setting, and the goal of the problem.)
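
The effect of constraint (12) on the four-action sequence above can be checked with a small Python evaluator for temporal formulas over a finite trajectory. This is a sketch, not a translation of rules (13)-(20): the tuple encoding, the propositional atoms "goal", "at" and "holding" (standing for goal(p, l_2), at(p, l_2) and holding(p)), and the choice to treat next as vacuously true in the final state are all assumptions of this example.

```python
def htf(phi, t, states):
    """Does temporal formula phi hold at time t of the finite trajectory `states`?"""
    last = len(states) - 1
    op = phi[0]
    if op == "atom":
        return phi[1] in states[t]
    if op == "not":
        return not htf(phi[1], t, states)
    if op == "and":
        return htf(phi[1], t, states) and htf(phi[2], t, states)
    if op == "implies":
        return (not htf(phi[1], t, states)) or htf(phi[2], t, states)
    if op == "next":         # assumption: vacuously true when there is no next state
        return t == last or htf(phi[1], t + 1, states)
    if op == "always":
        return all(htf(phi[1], u, states) for u in range(t, last + 1))
    if op == "eventually":
        return any(htf(phi[1], u, states) for u in range(t, last + 1))
    if op == "until":
        return any(htf(phi[2], u, states)
                   and all(htf(phi[1], v, states) for v in range(t, u))
                   for u in range(t, last + 1))
    raise ValueError(f"unknown operator: {op}")

# Formula (12) for one package and its goal location.
phi12 = ("always", ("implies", ("and", ("atom", "goal"), ("atom", "at")),
                    ("next", ("not", ("atom", "holding")))))

# States after pick_up, move, drop (good) and after an extra pick_up (bad).
good = [{"goal"}, {"goal", "holding"}, {"goal", "holding"}, {"goal", "at"}]
bad = good + [{"goal", "holding"}]
```

The three-action trajectory satisfies the formula, while the trajectory that picks the package up again at its destination violates it, so the corresponding model would be eliminated.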
3.2 Procedural Knowledge


Procedural knowledge can be thought of as an (under-specified) sketch of the
plans to be generated. This type of control knowledge has been used in GOLOG,
an Algol-like logic programming language for agent programming, control and
execution, based on a situation calculus theory of actions [20]. GOLOG has
been primarily used as a programming language for high-level agent control in
dynamical environments (see e.g. [7]). More recently, GOLOG has been used for
general planning [13]. In the planning context, a GOLOG program specifies an
arbitrarily incomplete plan that includes non-deterministic choice points that
are filled in by the planner (the deductive machinery of a GOLOG interpreter).
For example, a simple GOLOG program a1; a2; (a3|a4|a5); f? represents plans
which have a1 followed by a2, followed by one of a3, a4, or a5 such that f is true
upon termination of the plan. The interpreter, when asked for a solution to this
program, needs only to decide which one of a3, a4, or a5 it should choose. To
encode procedural knowledge, we introduce a set of Algol-like constructs such as
sequence, loop, conditional, and nondeterministic choice of arguments/actions.
These constructs are used to encode partial procedural control knowledge in the
form of programs. For an action theory (D, Γ), a program is defined inductively
as follows.


- an action a is a program,

- a formula φ is a program^,

- if the p_i's are programs then p_1; ...; p_n is a program,

- if the p_i's are programs then p_1 | ... | p_n is a program,

- if p_1 and p_2 are programs and φ is a formula then "if φ then p_1 else p_2"
is a program,

- if p is a program and φ is a formula then "while φ do p" is a program,
and

- if X is a variable of sort s, p(X) is a program, and f(X) is a formula, then
pick(X, f(X), p(X)) is a program.


As is common practice with Smodels, we will assign to each program a name
(with the exception of actions and formulas), provide rules for the construction
of programs, and use prefix notation. A sequence σ = p_1; ...; p_n will be
represented by the atoms proc(p), head(p, n_1), tail(p, n_2) and the set of atoms
representing p_2; ...; p_n, where p, n_1, and n_2 are the names assigned to σ, p_1 (if
it is not a primitive action or a formula), and p_2; ...; p_n, respectively.

The operational semantics of programs specifies when a trajectory s_0 a_0 s_1 ...
a_{n-1} s_n, denoted by α, is a trace of a program p and is defined as follows.

- for p = a and a is an action, n = 1 and a_0 = a,

- for p = φ, n = 0 and φ holds in s_0,

- for p = p_1; p_2, there exists an i such that s_0 a_0 ... s_i is a trace of p_1 and
s_i a_i ... s_n is a trace of p_2,

^ This is analogous to the GOLOG test action f? which tests the truth value of a
fluent.
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-  for  p  =  pi| . . .  |p„,  a  is  a  trace  of  pi  for  some  i  E  {1, . . . ,  n}, 

“  for  p  =  if  (j)  then  pi  else  p2,  a  is  a  trace  of  pi  if  (j)  holds  in  sq  or  a  is  a 

trace  ofp2  if  nep((^)  holds  in  sq, 

-  for  p  —  while  do  p^,  n  =  0  and  neg{(f))  holds  in  sq  or  (j)  holds  in  sq  and 

there  exists  some  i  such  that  Soao  ...si  is  a  trace  of  pi  and  SjUi ...  Sn  is  a 

trace  of  p,  and 

-  for  p  =  pick(X,  /(X),  q'(X)),  then  there  exists  a  constant  x  of  the  sort  of  X 
such  that  f{x)  holds  in  sq  and  a  is  a  trace  of  g(x). 
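The trace conditions above can be read as a recursive check. The following Python sketch is only an illustration, not the authors' Smodels encoding: programs are represented as tuples, formulas as Python predicates over a state, and the names `is_trace`, `go_up`, and `below_top` are hypothetical. It also assumes that a loop body consumes at least one step, mirroring the condition T1 < T3 in rule (28) below.

```python
def is_trace(p, states, actions, i, j):
    """Check whether the sub-trajectory s_i a_i ... s_j is a trace of program p."""
    kind = p[0]
    if kind == "act":      # primitive action: exactly one step with that action
        return j == i + 1 and actions[i] == p[1]
    if kind == "test":     # formula: zero steps, and it must hold in s_i
        return j == i and p[1](states[i])
    if kind == "seq":      # p1 ; p2 with some split point k
        return any(is_trace(p[1], states, actions, i, k)
                   and is_trace(p[2], states, actions, k, j)
                   for k in range(i, j + 1))
    if kind == "choice":   # p1 | ... | pn
        return any(is_trace(q, states, actions, i, j) for q in p[1])
    if kind == "if":       # if phi then p1 else p2
        branch = p[2] if p[1](states[i]) else p[3]
        return is_trace(branch, states, actions, i, j)
    if kind == "while":    # while phi do body
        if not p[1](states[i]):
            return j == i
        # Assumption: the body makes progress (k > i), so the recursion ends.
        return any(is_trace(p[2], states, actions, i, k)
                   and is_trace(p, states, actions, k, j)
                   for k in range(i + 1, j + 1))
    if kind == "pick":     # pick(X, f(X), q(X)) over a finite domain
        _, domain, f, q = p
        return any(f(x)(states[i]) and is_trace(q(x), states, actions, i, j)
                   for x in domain)
    raise ValueError("unknown program construct: %r" % (kind,))


# Toy trajectory: an elevator moving from floor 0 to floor 2.
states = [{"floor": 0}, {"floor": 1}, {"floor": 2}]
below_top = lambda s: s["floor"] < 2
go_up = ("while", below_top, ("act", "up"))

print(is_trace(go_up, states, ["up", "up"], 0, 2))
print(is_trace(go_up, states, ["up", "down"], 0, 2))
```

The checker only inspects action names and formula values; whether the trajectory respects the action theory is a separate question that the planning rules handle.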

The logic programming rules that realize this semantics follow. We define a predicate trans(p, t1, t2) which holds in a stable model $S$ iff $s(t_1) a_{t_1} \ldots s(t_2)$ is a trace of $p$.


trans(P, T1, T2) ← proc(P), head(P, P1), tail(P, P2),                  (21)
                   trans(P1, T1, T3), trans(P2, T3, T2).

trans(A, T, T+1) ← action(A), A ≠ null, occ(A, T).                     (22)

trans(null, T, T) ←                                                    (23)

trans(N, T1, T2) ← choiceAction(N),                                    (24)
                   in(P1, N), trans(P1, T1, T2).

trans(F, T1, T1) ← formula(F), hf(F, T1).                              (25)

trans(I, T1, T2) ← if(I, F, P1, P2),                                   (26)
                   hf(F, T1), trans(P1, T1, T2).

trans(I, T1, T2) ← if(I, F, P1, P2),                                   (27)
                   not hf(F, T1), trans(P2, T1, T2).

trans(W, T1, T2) ← while(W, F, P), hf(F, T1), T1 < T3 ≤ T2,            (28)
                   trans(P, T1, T3), trans(W, T3, T2).

trans(W, T, T) ← while(W, F, P), not hf(F, T).                         (29)

trans(S, T1, T2) ← choiceArgs(S, F, P),                                (30)
                   hf(F, T1), trans(P, T1, T2).


Finding a valid instantiation of a program $P$ can be viewed as a planning problem $(D, \Gamma, \Delta)$ where $\Delta$ is the constraint not trans(P, 0, n). Consider the program obtained from $\Pi_n$ by (i) adding the rules (21)-(30), and (ii) replacing the goal constraint with not trans(P, 0, n). The following theorem, stated for this extended program, is similar to Theorem 1.

Theorem 2. Let $(D, \Gamma)$ be an action theory and $P$ be a program. Then, (i) for every stable model $S$ of the extended program obtained from $\Pi_n$ as described above, $s(0) a_0 \ldots a_{n-1} s(n)$ is a trace of $P$; and (ii) if $s_0 a_0 \ldots a_{n-1} s_n$ is a trace of $P$ then there exists a stable model $S$ of that program such that $s_j = s(j)$ and $occ(a_i, i) \in S$ for $j \in \{0, \ldots, n\}$ and $i \in \{0, \ldots, n-1\}$.

Footnote (on $s(t)$): Recall that we define $s(t) = \{holds(f, t) \in S \mid f \text{ is a fluent}\}$ and assume $occ(a_t, t) \in S$.

3.3 HTN Knowledge

GOLOG programs are good for representing procedural knowledge but prove cumbersome for encoding partial orderings between programs and do not allow temporal constraints. For example, to represent that any sequence containing the $n$ programs $p_1, \ldots, p_n$, in which $p_1$ occurs before $p_2$, is a valid plan for a goal $\Delta$, one would need to list all the possible sequences and then use the non-deterministic construct. This can be easily represented by an HTN consisting of the set $\{p_1, \ldots, p_n\}$ and a constraint expressing that $p_1$ must occur before $p_2$. HTNs also allow maintenance constraints of the form always($\phi$) to be represented. However, HTNs do not have complex constructs such as procedures, conditionals, or loops. Attempts to combine hierarchical constraints and GOLOG-like programs (e.g., [4]) have fallen short since they do not allow complex programs to occur within these HTN programs. We will show next that, under the ASP framework, this restriction can be eliminated by adding the following item to the definition of programs in the previous section.

— If $p_1, \ldots, p_n$ are programs then a pair $(S, C)$ is a program where $S = \{p_1, \ldots, p_n\}$ and $C$ is a set of ordering or truth constraints (defined below).

Let $S = \{p_1, \ldots, p_k\}$ be a set of programs. Assume that $n_i$, $1 \le i \le k$, is the name assigned to the program $p_i$. An ordering constraint over $S$ has the form $n_i \prec n_j$ where $n_i \ne n_j$, and a truth constraint is of the form $(n_i, \phi)$, $(\phi, n_i)$, or $(n_i, \phi, n_t)$ where $\phi$ is a formula. In our encoding, we will represent a program $(S, C)$ by an atom htn(p, Sn, Cn) where p, Sn, and Cn are the names assigned to $(S, C)$, $S$, and $C$ respectively. To complete our extension, we need to define when a trajectory is a trace of a program with the new construct and provide logic program rules for checking its satisfaction. A trajectory $s_0 a_0 \ldots a_{n-1} s_n$ is a trace of a program $(S, C)$ if there exists a sequence $j_0 = 0 \le j_1 \le \ldots \le j_k = n$ and a permutation $(i_1, \ldots, i_k)$ of $(1, \ldots, k)$ such that the sequence of trajectories $\alpha_1 = s_0 a_0 \ldots s_{j_1}$, $\alpha_2 = s_{j_1} a_{j_1} \ldots s_{j_2}$, ..., $\alpha_k = s_{j_{k-1}} a_{j_{k-1}} \ldots s_n$ satisfies the following conditions:

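— for each $l$, $1 \le l \le k$, $\alpha_l$ is a trace of $p_{i_l}$,

— if $n_t \prec n_l \in C$ then $i_t < i_l$,

— if $(\phi, n_l) \in C$ (or $(n_l, \phi) \in C$) then $\phi$ holds in the state $s_{j_{l-1}}$ (or $s_{j_l}$), and

— if $(n_t, \phi, n_l) \in C$ then $\phi$ holds in $s_{j_t}, \ldots, s_{j_{l-1}}$.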

We will extend the predicate trans to allow the new type of programs to be considered. Rules for checking the satisfaction of a program htn(N, S, C) are given next.

trans(N, T1, T2) ← htn(N, S, C),                                        (31)
                   not nok(N, T1, T2).

1{begin(N, I, T3, T1, T2) : between(T3, T1, T2)}1 ← htn(N, S, C),       (32)
                   in(I, S),
                   trans(N, T1, T2).

1{end(N, I, T3, T1, T2) : between(T3, T1, T2)}1 ← htn(N, S, C),         (33)
                   in(I, S),
                   trans(N, T1, T2).

nok(N, T1, T2) ← htn(N, S, C),                                          (34)
                 in(I, S), T3 > T4,
                 begin(N, I, T3, T1, T2),
                 end(N, I, T4, T1, T2).

nok(N, T1, T2) ← htn(N, S, C),                                          (35)
                 in(I, S), T3 ≤ T4,
                 begin(N, I, T3, T1, T2),
                 end(N, I, T4, T1, T2),
                 not trans(I, T3, T4).

nok(N, T1, T2) ← htn(N, S, C),                                          (36)
                 not trans(N, T1, T2).

Footnote (on the non-deterministic construct): For $n = 3$, the three possibilities are $p_1;p_2;p_3$, $p_1;p_3;p_2$, and $p_3;p_1;p_2$. Using a concurrent construct $\|$, these three programs can be packed into two programs $p_1;p_2 \| p_3$ and $p_1;p_3;p_2$.


In the above rules, the predicates begin(N, I, T3, T1, T2) and end(N, I, T4, T1, T2) are used to record the beginning and the end of the program I, a member of N. Rules (32)-(33) make sure that each program will have start and end times. These two rules are not logic programming rules but are unique to Smodels encodings. They were introduced to simplify the encoding of choice rules [28], and can be translated into a set of normal logic program rules. The predicate nok(N, T1, T2) states that the assignments for programs are not acceptable. (We omit the rules that check for the satisfiability of constraints in C of a program htn(N, S, C). They can be downloaded from our Web site.) Theorem 2 will still hold.


3.4  Demonstration  Experiments 

We  tested  our  implementation  with  some  domains  from  the  general  planning 
literature  and  from  the  AIPS  planning  competition  [2].  We  chose  problems  for 
which  procedural  control  knowledge  appeared  to  be  easier  to  exploit  than  other 
types of control knowledge. Our motivation was: (i) it has already been established that well-chosen temporal and hierarchical constraints will improve a planner's efficiency; (ii) we have previously experimented with the use of temporal
knowledge  in  the  ASP  framework  [29];  and  (iii)  we  are  not  aware  of  any  empiri¬ 
cal  results  indicating  the  utility  of  procedural  knowledge  in  planning,  especially 
in  ASP.  ([13]  concentrates  on  using  GOLOG  to  do  planning  in  domains  with 
incomplete  information,  not  on  exploiting  procedural  knowledge  in  planning.) 

We selected the elevator example from [20] (elp1-elp3) and the Miconic-10 elevator domain (s1-0, ..., s5-0s2), proposed by Schindler Lifts Ltd. for the AIPS 2000 competition [2]. Note that some of the planners that competed in AIPS 2000 were unable to solve this problem. Due to space limitations we cannot present the action theories and the Smodels encoding of the programs here. They can be found at the URL mentioned previously. The times taken to compute one model with and without control knowledge are given in columns 5 and 6 of the table below, respectively.


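Problem       Plan Length   # Persons   # Floors   With Control Knowledge   Without Control Knowledge
elp1          (illegible)   2           6          0.600                    0.560
elp2          (illegible)   3           6          1.411                    6.729
elp3          (illegible)   4           6          3.224                    120.693
(illegible)   4             1           2          0.020 (only one timing survives; column unclear)
(illegible)   8             2           4          0.921 (only one timing survives; column unclear)
(illegible)   12            3           6          22.682                   34.519
(illegible)   15            4           8          164.055                  314.101
s5-0s1        19            5           4          57.952                   > 2 hours
(illegible)   19            5           5          105.040                  > 2 hours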

As can be seen, the encoding with control knowledge yields substantially better performance when the minimal plan length is large. For large instances (the last two rows), Smodels can find a plan using control knowledge in a short time and cannot find a plan in 2 hours without control knowledge. In some small instances (the time in column 6 is in boldface), the speed-up cannot make up for the overhead of grounding the control knowledge. The output of Smodels for each run is given in the file result at the above URL. For larger instances of the elevator domain [2] (5 persons or more and 10 floors or more), our implementation terminated prematurely with either a stack overflow error or a segmentation fault.

4  Discussions  and  Future  Work 

In  this  paper  we  presented  a  declarative  approach  to  adding  domain-dependent 
control  knowledge  to  ASP.  Our  approach  enables  different  types  of  control  knowl¬ 
edge  such  as  hierarchical,  temporal,  or  procedural  knowledge  to  be  represented 
and  exploited  in  parallel;  thus  combining  the  ideas  of  HTN-planning,  GOLOG- 
programming,  and  planning  with  temporal  knowledge  into  ASP.  For  exam¬ 
ple,  one  can  find  a  valid  instantiation  of  a  GOLOG  program  that  satisfies 
some  temporal  constraints.  This  distinguishes  our  work  from  other  related  work 
[17,19,4,25]  where  only  one  or  two  types  of  constraints  were  considered  or  com¬ 
bined.  Moreover,  in  a  propositional  environment,  ASP  with  procedural  knowl¬ 
edge  can  be  viewed  as  an  off-line  interpreter  for  a  GOLOG  program.  Because  of 
the declarative nature of logic programming, the correctness of this interpreter is easier to prove than that of an interpreter written in Prolog. We view domain-dependent control knowledge as independent sets of constraints. An advantage of this approach is that domain-dependent control knowledge can be modularly formalized and added to planning problems as desired.

Footnote (on the experimental setup): Experiments were run on an HP OmniBook 6000 laptop with 130,544 KB of RAM and an Intel Pentium III 600 MHz processor.

Our experimental results demonstrate that ASP can scale up better with domain-dependent control knowledge. In keeping with the experience of researchers who have incorporated control knowledge into SATplan (e.g., [19]), we do not expect ASP with only one type of domain-dependent knowledge to do better than TLPLAN [1], as Smodels is a general-purpose system. But in the presence of near-deterministic procedural constraints, our approach may do better. More rigorous experimentation with a variety of domains, including those used in the AIPS planning competition, will be a significant focus of our future work.
Abstract. We investigate the relationship amongst some solutions to the frame problem. We encode Pednault's syntax-based solution [20], Baker's state-minimization policy [1], and Gelfond & Lifschitz's Action Language $\mathcal{A}$ [7] in the propositional dynamic logic (PDL). The formal relationships among these solutions are given. The results of the paper show that dynamic logic, as one of the formalisms for reasoning about dynamic domains, can be used as a formal tool for comparing and unifying logics of action.

Keywords:  relationships  between  formalisms,  frame  problem,  dynamic 
logic. 


1  Introduction 

Among  the  established  formalisms  for  specifying  and  reasoning  about  actions 
are  the  situation  calculus  [19,23],  STRIPS  [3],  the  event  calculus  [17],  action 
languages  [7]  and  some  other  monotonic  or  nonmonotonic  logics  such  as  in  [9]. 
Fundamental  problems  in  this  area,  such  as  the  frame  problem,  ramification 
problem,  and  qualification  problem,  have  been  widely  investigated  with  varying 
degrees  of  success.  Clearly,  the  time  has  come  to  analyze,  compare  and  system¬ 
atize  these  formalisms  and  solutions  in  order  to  obtain  a  more  complete  and 
unified  (if  possible)  theory  of  action. 

This  paper  focuses  on  solutions  to  the  frame  problem.  We  compare  and  ana¬ 
lyze  the  main  solutions  to  the  frame  problem  in  the  literature  by  encoding  them 
in the propositional dynamic logic (PDL). The reasons for choosing PDL as the medium are twofold. First, the language of dynamic logic is expressive. It provides built-in expression of compound actions (i.e., generated from primitive actions by the program connectives $;$, $\cup$, $?$, $*$), non-deterministic effects and qualifications of actions. It has also been extended to represent concurrent actions [10], non-execution of actions [8], and indirect effects of actions [11,26]. Second,
dynamic  logic  features  a  sound  and  complete  axiomatic  deductive  system  and  a 
well-developed  Kripkean  semantics.  Its  proof  and  model  theory  have  reached  a 
high  degree  of  sophistication  through  the  development  of  theoretical  computer 
science.  Some  features,  such  as  decidability  and  the  finite  model  property  of 
PDL,  and  techniques  such  as  bisimulation  and  filtration,  are  well  understood. 

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  240-253,  2001. 
In contrast to other formalisms, such as the situation calculus [23] and action languages [7], PDL does not have a built-in solution to the frame problem (PDL-based solutions to the frame problem have been proposed via extensions [2,8,22]). In this paper, however, we show that three solutions to the frame problem (Pednault's syntax-based approach [20], Baker's circumscription [1] and the action language $\mathcal{A}$ [7]) can be encoded in PDL. The relationship amongst these solutions is clarified, and we prove that when action descriptions are in normal form and queries are simple, these solutions to the frame problem are essentially equivalent. In contrast to the work in [14], our results show that the equivalence of the solutions depends heavily on the syntactic restrictions on action descriptions and queries.

Due  to  the  limitation  of  space,  we  omit  all  the  proofs  of  theorems. 

2  Reasoning  about  Action  in  PDL 

In dynamic logic, a causal relation between an action $\alpha$ and its effect $A$ is expressed by a modal formula $[\alpha]A$, read as "$\alpha$ always causes $A$". For instance, $[Shoot]\neg alive$ represents "shooting at a turkey kills the turkey". The formula $\langle\alpha\rangle A$ reads as "$\alpha$ is executable and possibly causes $A$ to be true", where $\langle\alpha\rangle$ is the dual operator of $[\alpha]$. In particular, $\langle\alpha\rangle\top$ represents "$\alpha$ is executable", where $\top$ represents the logical constant true. $\prec\alpha\succ A$ denotes $\langle\alpha\rangle\top \to \langle\alpha\rangle A$, meaning "if $\alpha$ is executable, then $\alpha$ may cause $A$". A language of PDL consists of a set $Flu$ of fluent symbols (propositional variables) and a set $Act_P$ of primitive action symbols. We will use $f, f_1, f_2$, etc., to denote fluents, and use $a, a_1, a_2$, etc., for primitive actions. The formulas ($A \in Fma$) and actions ($\alpha \in Act$) can be defined as usual [18]. A formula which does not include modal operators is referred to as a propositional formula ($\varphi \in Fma_P$). The semantics and deductive system of PDL can be found in any standard introductory text, e.g., [18].

2.1  Action  Description 

PDL  provides  a  formal  language  to  describe  behaviors  and  internal  relations  of 
a  dynamic  system.  Those  sentences  which  describe  the  generic  effects  of  actions, 
domain  constraints  and  causal  ramifications  are  generally  called  action  descrip¬ 
tion.  In  this  paper,  an  action  description  of  a  dynamic  system  is  any  finite  set 
of  PDL  formulas. 

Example 1 Consider the Yale Shooting Problem [12] described by the following action description:

$\neg loaded \to [Load]loaded$
$loaded \to [Shoot]\neg alive$
$loaded \to [Shoot]\neg loaded$
$\langle Load\rangle\top$, $\langle Wait\rangle\top$, $\langle Shoot\rangle\top$

The first three sentences state the effects of the actions Load and Shoot on the fluents loaded and alive (effect axioms). The last three represent the executability of actions (qualification axioms).

Footnote (on the omitted proofs): The proofs are available at http://www.cse.unsw.edu.au/~ksg/Pubs/ksgworking.html.

An action description $\Sigma$ is normal if each formula in $\Sigma$ is of the form:

- $\varphi \to [a]l$ (deterministic action law)

- $\varphi \to \prec a \succ l$ (non-deterministic action law)

- $\varphi \to \langle a\rangle\top$ (qualification law)

where $\varphi \in Fma_P$, $a \in Act_P$ and $l$ is a fluent literal.

It is easy to see that the action description in Example 1 is normal.

2.2  Reasoning  with  Action  Description 

A formula in an action description is different from an ordinary formula. The sentence "$loaded \to [Shoot]\neg alive$" states that whenever loaded is true, Shoot must cause $\neg alive$. In the situation calculus this is written as $\forall s\,(loaded(s) \to \neg alive(do(Shoot, s)))$ instead of $loaded(s) \to \neg alive(do(Shoot, s))$ for some particular situation $s$. A simple approach to the problem in dynamic logic, which was introduced in [26], is to treat an action description as a set of extra axioms of PDL.

Definition 1 [26] Let $\Sigma$ be an action description. A formula $A$ is $\Sigma$-provable, written $\vdash^{\Sigma} A$, if it belongs to the smallest set of formulas which contains all theorems of PDL and all elements of $\Sigma$, and is closed under modus ponens and modal generalization [18].

Consider the action description $\Sigma$ in Example 1. We can prove that $\vdash^{\Sigma} \neg loaded \to [Load; Shoot]\neg alive$.

2.3  Consistency  of  Action  Description 

An action description $\Sigma$ is consistent if $\nvdash^{\Sigma} \bot$, where $\bot$ represents the logical constant "false". Let $\Sigma$ be a normal action description. For any fluent $f$ and any primitive action $a$, if we merge the action laws about $a$ and $f$ ($\neg f$) in each form together, there are at most five laws about $a$ and $f$ in $\Sigma$:

$\varphi_0 \to \langle a\rangle\top$

$\varphi_{1,1} \to [a]f$, $\varphi_{1,2} \to [a]\neg f$

$\varphi_{2,1} \to \prec a \succ \neg f$, $\varphi_{2,2} \to \prec a \succ f$

Footnote (on the normal form): In [25], the normal form of action descriptions is defined in a more general version to express indirect effects of actions based on the extended PDL [26].

Footnote (on the proof of the Yale Shooting query): By using PDL axioms and Definition 1.

Footnote (on consistency): In [26], it is called uniformly consistent, to distinguish it from the consistency of a normal set of formulas.
It is easy to see that if $\varphi_0$, $\varphi_{1,1}$ and $\varphi_{1,2}$ are true simultaneously, then the action description will contain a contradiction. Similarly for $\varphi_0$, $\varphi_{1,j}$ and $\varphi_{2,j}$ ($j = 1$ or $j = 2$). We call a normal action description $\Sigma$ safe if it satisfies the following assumption:

$\vdash \neg\varphi_0 \lor \neg\varphi_{1,1} \lor \neg\varphi_{1,2}$ and $\vdash \neg\varphi_0 \lor \neg\varphi_{1,j} \lor \neg\varphi_{2,j}$ ($j = 1, 2$)

The following theorem shows that safety is a sufficient condition for the consistency of normal action descriptions.

Theorem 1 [25] Let $\Sigma$ be a normal action description. If $\Sigma$ is safe, then it is consistent.

Since  the  action  description  in  Example  1  is  safe,  it  is  consistent. 
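Because the conditions in a normal action description are propositional, the safety assumption can be checked by enumerating interpretations, on the assumption that provability of a propositional formula in PDL coincides with propositional validity. The Python sketch below is illustrative only: conditions are encoded as predicates over a dictionary of fluent values, an absent law is given the condition false, and the name `is_safe` is hypothetical.

```python
from itertools import product

FALSE = lambda w: False   # condition of a law that is absent
TRUE = lambda w: True     # condition of an unconditional law


def is_safe(fluents, phi0, phi11=FALSE, phi12=FALSE, phi21=FALSE, phi22=FALSE):
    """Check the safety assumption for one action a and one fluent f.

    phi0  -> <a>T      phi11 -> [a]f       phi12 -> [a]not f
    phi21 -> a may cause not f             phi22 -> a may cause f
    """
    for values in product([False, True], repeat=len(fluents)):
        w = dict(zip(fluents, values))
        # not phi0 or not phi11 or not phi12 must hold everywhere
        if phi0(w) and phi11(w) and phi12(w):
            return False
        # not phi0 or not phi1j or not phi2j must hold everywhere (j = 1, 2)
        if phi0(w) and ((phi11(w) and phi21(w)) or (phi12(w) and phi22(w))):
            return False
    return True


fluents = ["loaded", "alive"]
loaded = lambda w: w["loaded"]

# Shoot and alive in Example 1: only the law loaded -> [Shoot] not alive.
shoot_alive_safe = is_safe(fluents, TRUE, phi12=loaded)

# A deliberately unsafe variant: loaded forces both alive and not alive.
clash_safe = is_safe(fluents, TRUE, phi11=loaded, phi12=loaded)

print(shoot_alive_safe, clash_safe)
```

Running the same check for every action and fluent pair of Example 1 is a direct way to confirm the claim that the description is safe.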

We  remark  that  the  normal  form  is  quite  expressive  though  not  every  action 
description  can  be  expressed  in  normal  form.  Any  action  description  written 
in  the  form  of  pre-condition  axioms  and  successor  state  axioms  in  the  proposi¬ 
tional  situation  calculus  language  (that  is,  there  are  no  sort  object  and  function 
symbols  in  the  language  [23])  can  be  translated  into  normal  form  and  moreover 
the resultant action descriptions are safe. Action descriptions written in $\mathcal{A}$ or in STRIPS can also be expressed in normal form. Additionally, the determinism of actions (i.e., for any initial state there exists one and only one next state) can be expressed in normal form.

3  Properties  of  PDL  Models 

We  now  present  some  special  properties  of  PDL  models  which  are  not  included 
in  the  standard  discourse  of  dynamic  logic  but  are  useful  for  the  purpose  of  the 
paper. 

3.1  PDL  Models 

A model for a PDL language is a structure of the form $M = (W, \{R_a : a \in Act_P\}, V)$, with $R_a$ a binary relation on $W$ for each primitive action. Note that we only consider the accessibility relations of primitive actions. Those for compound actions can be defined by using the standard model conditions [18]. The satisfiability relation is defined as usual. A model $M$ satisfying a formula $A$ in world $w$ is denoted $M \models_w A$. $A$ is valid in $M$, denoted by $M \models A$, if $M \models_w A$ for all $w \in W$. Let $\Sigma$ be an action description. A model $M$ is a $\Sigma$-model if $M \models A$ for any $A \in \Sigma$. Intuitively, a model is a $\Sigma$-model if $\Sigma$ is true in every world of the model; $Mod(\Sigma)$ denotes the set of all $\Sigma$-models. In [26], it is shown that for any action description $\Sigma$, $\vdash^{\Sigma} A$ iff $A$ is valid in all $\Sigma$-models. We now investigate models which are relevant to the models of action languages and the situation calculus.

Definition 2 A model $M = (W, \mathcal{R}, V)$ is saturated if for each interpretation $I$ of $Flu$, there exists $w \in W$ such that $M \models_w I$. We use $Mod_S(\Sigma)$ to denote the set of all saturated $\Sigma$-models.
Proposition 1 If $\Sigma$ is normal and safe, then $\vdash^{\Sigma} A$ iff $M \models A$ for any $M \in Mod_S(\Sigma)$.

We will show in section 6 that the saturation of PDL models corresponds to the Existence of Situation Axioms (ESA) [1]. Note that Proposition 1 depends on the definition of the normal form of action description. If we allow a normal action description to describe domain constraints or indirect effects, this proposition will cease to hold.

Definition 3 A model $M = (W, \mathcal{R}, V)$ is natural if

1. $W$ is the set of all interpretations of $Flu$,

2. for any $f \in Flu$, $w \in V(f)$ iff $f \in w$.

We denote the set of all natural $\Sigma$-models by $Mod_N(\Sigma)$.

It is easy to see that any natural model is saturated.

Proposition 2 If $\Sigma$ is normal and safe, then $\vdash^{\Sigma} \varphi \to [a_1; \ldots; a_n]l$ iff $M \models \varphi \to [a_1; \ldots; a_n]l$ for any $M \in Mod_N(\Sigma)$.

A formula of the form $\varphi \to [a_1; \ldots; a_n]l$ is referred to as a simple query, where $\varphi$ is a propositional formula and $l$ a literal. Notice that Proposition 2 is only true for simple queries. For instance, let $\Sigma = \emptyset$, $Flu = \{f\}$ and $Act_P = \{a\}$. Let
A  =  fV  ^  ay  ->/V  ^  a  y^  ay  f\/  <  a  > — <  a  y<  ay  f.  Then  A  is  valid  in  all 
the natural models but $\nvdash^{\Sigma} A$. This proposition is a key lemma of Theorem 2.

Definition 4 A model $M = (W, \mathcal{R}, V)$ is functional if for any $a \in Act_P$, $R_a$ is a function on $W$. We denote the set of all natural functional $\Sigma$-models by $Mod_{NF}(\Sigma)$.

The syntactic condition corresponding to functional models is so-called determinism, which means that each state has one and only one next state after an action.

Definition 5 An action description $\Sigma$ is deterministic if $\Sigma$ proves every formula in the set $\{\langle a\rangle f \to [a]f : a \in Act_P \text{ and } f \in Flu\} \cup \{\langle a\rangle\top : a \in Act_P\}$.

Note that $\langle a\rangle f \to [a]f$ can be expressed in normal form in the following way: $f' \to [a]f$, $\neg f' \to [a]\neg f$, where $f'$ is a new fluent symbol (in most cases, we can put the descriptions of determinism and effects of actions together without introducing new fluent symbols).

Proposition 3 Let $\Sigma$ be normal and safe. If $\Sigma$ is deterministic, then $\vdash^{\Sigma} A$ iff $M \models A$ for any $M \in Mod_{NF}(\Sigma)$.

Note the difference between Propositions 2 and 3. We can relax the restriction to simple queries at the price of allowing only deterministic action descriptions.

Footnote (on determinism): Here we assume, for simplicity, that a deterministic action is always executable. This can be relaxed at the price of a more complex formalization.

3.2 Minimizing PDL Models

Let $M = (W, \mathcal{R}, V)$ be a PDL model. For any $w \in W$, let $\|w\| = \{f \in Flu : M \models_w f\} \cup \{\neg f : f \in Flu \text{ and } M \not\models_w f\}$. We denote $Chg(M) = \{(a, f, w) : \exists w' (w R_a w' \text{ and } f \in (\|w\| \setminus \|w'\|) \cup (\|w'\| \setminus \|w\|))\}$. In words, $(a, f, w) \in Chg(M)$ iff there exists a world $w'$ accessible from $w$ via action $a$ such that the truth value of $f$ is different at $w$ and $w'$.

Definition 6 For any $M_1, M_2 \in Mod(\Sigma)$, $M_1 \sqsubseteq M_2$ iff

1. $W_1 = W_2$,

2. $V_1(f) = V_2(f)$,

3. $Chg(M_1) \subseteq Chg(M_2)$.

We denote the set of $\sqsubseteq$-minimal models in $Mod(\Sigma)$ as $min(Mod(\Sigma))$. Intuitively, $M_1 \sqsubseteq M_2$ means $M_1$ has less state change than $M_2$.
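To make the change set concrete, the following Python sketch computes a simplified version of $Chg(M)$ for a small model. It is an illustration under stated assumptions, not part of the paper: worlds are encoded as frozensets of the fluents true in them, the accessibility relations as a dictionary from action names to sets of world pairs, and each change is recorded by the fluent symbol rather than by the two literals $f$ and $\neg f$ (which always enter the set together). The names `chg`, `m1`, and `m2` are hypothetical.

```python
def chg(relations):
    """Collect (action, fluent, world) triples where the fluent's value changes."""
    changes = set()
    for action, pairs in relations.items():
        for w, w_next in pairs:
            # Symmetric difference: fluents true in exactly one of the two worlds.
            for f in w ^ w_next:
                changes.add((action, f, w))
    return changes


w0 = frozenset({"loaded", "alive"})
w1 = frozenset({"alive"})

# m1: Wait never changes anything.
m1 = {"Wait": {(w0, w0), (w1, w1)}}
# m2: Wait may additionally unload the gun in w0.
m2 = {"Wait": {(w0, w0), (w0, w1), (w1, w1)}}

print(chg(m1))
print(chg(m2))
print(chg(m1) <= chg(m2))
```

Condition 3 of Definition 6 corresponds to the subset test on the last line; conditions 1 and 2 (same worlds, same valuation) hold here by construction because both models share the worlds `w0` and `w1`.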

4  Pednault’s  Solution  to  the  Frame  Problem 

We  first  encode  Pednault’s  syntax-based  solution  [20]  to  the  frame  problem  in 
PDL.  Before  doing  this,  let’s  recall  the  meaning  of  the  frame  problem. 

To  formalize  the  effects  of  actions  in  a  dynamic  system,  it  is  necessary  to 
provide  all  the  effect  axioms  of  actions  (which  specify  what  is  affected  by  actions). 
Often  this  is  easy  because  most  actions  affect  only  a  few  of  the  relevant  fluents. 
In  contrast,  listing  all  the  frame  axioms  (which  specify  what  is  not  affected  by 
actions)  is  tedious.  Moreover,  they  are  much  more  numerous  than  effect  axioms. 
For instance, in Example 1, only effect axioms were listed. There are nine frame axioms, such as $alive \to [Load]alive$, $loaded \to [Wait]loaded$, etc., that were not listed. Without these axioms, the action description is incomplete: we cannot even establish the intuitive assertion $alive \to [Load]alive$. The frame problem is how to devise an inference mechanism for reasoning about the effects of actions with incomplete action descriptions.

Pednault [20] introduced an approach to the frame problem with which frame axioms can be automatically generated from effect axioms and qualification axioms. Consider a normal action description $\Sigma$ without non-deterministic action laws. Suppose that the positive and negative effect axioms and qualification axioms about action $a$ and fluent $f$ in $\Sigma$ are:

$\varphi_0 \to \langle a\rangle\top$, $\varphi_1 \to [a]f$, $\varphi_2 \to [a]\neg f$.

According to the Completeness Assumption [23], we have the following frame axioms:

$FA^{+}_{a,f}$ : $(\neg\varphi_0 \lor \neg\varphi_2) \land f \to [a]f$

$FA^{-}_{a,f}$ : $(\neg\varphi_0 \lor \neg\varphi_1) \land \neg f \to [a]\neg f$

All frame axioms generated by this procedure are referred to as the frame axioms with respect to $\Sigma$. For instance, $\neg loaded \land alive \to [Shoot]alive$ is a frame axiom about Shoot and alive with respect to the action description in Example 1. Suppose that $\Delta$ is the set of all the generated frame axioms with respect to $\Sigma$. Then we are able to prove that $\{\neg loaded\} \vdash^{\Sigma \cup \Delta} [Load; Wait; Shoot]\neg alive$.

In general, given a set $\Sigma$ of effect axioms, we generate all the frame axioms with the above procedure. Let $\Delta$ be all the generated frame axioms. Then $\Sigma \cup \Delta$ will be the complete action description with respect to $\Sigma$. Therefore, to answer a query $A$, we only have to make the inference $\vdash^{\Sigma \cup \Delta} A$.
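The effect of completing a description with Pednault-style frame axioms can be illustrated by simulating the Yale Shooting description of Example 1. The Python sketch below is a hedged illustration rather than a PDL prover: it assumes a safe, deterministic description in which every action is executable, encodes conditions as predicates over a dictionary of fluent values, and uses the hypothetical names `POS`, `NEG`, `EXEC`, `frame_conditions`, and `progress`.

```python
TRUE = lambda w: True
FALSE = lambda w: False   # condition of an absent effect law

FLUENTS = ["loaded", "alive"]
# Effect and qualification laws of Example 1.
POS = {("Load", "loaded"): lambda w: not w["loaded"]}           # phi1 -> [a]f
NEG = {("Shoot", "alive"): lambda w: w["loaded"],               # phi2 -> [a]not f
       ("Shoot", "loaded"): lambda w: w["loaded"]}
EXEC = {"Load": TRUE, "Wait": TRUE, "Shoot": TRUE}              # phi0 -> <a>T


def frame_conditions(a, f):
    """Return the antecedents of the generated frame axioms for a and f."""
    phi0 = EXEC[a]
    phi1 = POS.get((a, f), FALSE)
    phi2 = NEG.get((a, f), FALSE)
    keep_true = lambda w: (not phi0(w) or not phi2(w)) and w[f]        # FA+
    keep_false = lambda w: (not phi0(w) or not phi1(w)) and not w[f]   # FA-
    return keep_true, keep_false


def progress(w, a):
    """Successor state determined by the effect laws plus the frame axioms."""
    nxt = {}
    for f in FLUENTS:
        phi1 = POS.get((a, f), FALSE)
        keep_true, _keep_false = frame_conditions(a, f)
        # f holds afterwards iff a positive effect fires or FA+ preserves it;
        # otherwise a negative effect fires or FA- preserves falsity.
        nxt[f] = phi1(w) or keep_true(w)
    return nxt


def run(w, actions):
    for a in actions:
        w = progress(w, a)
    return w


# From any state where the gun is unloaded, Load; Wait; Shoot ends with not alive.
results = [run({"loaded": False, "alive": alive}, ["Load", "Wait", "Shoot"])
           for alive in (True, False)]
print(results)
```

The frame axiom quoted in the text can be checked the same way: shooting with an unloaded gun in a state where alive holds leaves alive true.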

The following result establishes the semantic condition for Pednault's solution. It also gives the relationship between syntax-based and minimization-based approaches.

Observation 1 Let $\Sigma$ be a normal action description without non-deterministic action laws. If $\Sigma$ is safe, then $\vdash^{\Sigma \cup \Delta} A$ iff $M \models A$ for any $M \in min(Mod_S(\Sigma))$, where $\Delta$ is the set of frame axioms with respect to $\Sigma$.

It is not hard to extend Pednault's solution to the non-deterministic case.

5 Encoding the Action Language $\mathcal{A}$ in PDL

The action languages [6] offer a simple and elegant solution to the frame problem. In this section, we show that the action language $\mathcal{A}$ can be embedded into PDL. Our approach can also be extended to the action languages $\mathcal{B}$ and $\mathcal{C}$ if we base it on the extended propositional dynamic logic (EPDL) [26].

An action description $\Sigma$ in the language $\mathcal{A}$ [6] is a set of expressions of the form: $a$ causes $l$ if $\varphi$, where $a$ is a primitive action, $l$ is a fluent literal, and $\varphi$ is a conjunction of literals. The state of a dynamic domain is expressed by a set of axioms of the form: now $l$. A query in action language $\mathcal{A}$ is an expression of the form: necessarily $\varphi$ after $a_1, \ldots, a_n$, where $\varphi$ is a propositional formula and $a_1, \ldots, a_n$ are primitive actions.

A structure $T = (W, \{R_a \subseteq W \times W : a \in Act_P\}, V)$ is a transition system of an action description $\Sigma$ if

1. $W$ is the set of all interpretations of $Flu$,

2. $V$ is a function from $Flu$ to $2^W$ such that $w \in V(f)$ iff $f \in w$,

3. $(w, w') \in R_a$ iff $E(a, w) \subseteq w' \subseteq E(a, w) \cup w$, where $E(a, w)$ is the set of the heads $l$ of all expressions "$a$ causes $l$ if $\varphi$" in $\Sigma$ such that $w$ satisfies $\varphi$.
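Condition 3 can be turned into a small procedure that enumerates the successors of a world. The Python sketch below is an illustration only: it assumes worlds are frozensets of true fluents, literals are pairs `(fluent, truth_value)`, and laws are triples `(action, head_literal, condition_literals)`; the names `literals`, `effects`, and `successors` are hypothetical.

```python
from itertools import product


def literals(world, fluents):
    """View a world (set of true fluents) as a complete set of literals."""
    return {(f, f in world) for f in fluents}


def effects(laws, action, world, fluents):
    """E(a, w): heads of the laws for `action` whose conditions hold in `world`."""
    lits = literals(world, fluents)
    return {head for (a, head, cond) in laws if a == action and set(cond) <= lits}


def successors(laws, action, world, fluents):
    """All w' with E(a, w) <= w' <= E(a, w) | w, comparing sets of literals."""
    e = effects(laws, action, world, fluents)
    upper = e | literals(world, fluents)
    result = []
    for bits in product([False, True], repeat=len(fluents)):
        cand = frozenset(f for f, b in zip(fluents, bits) if b)
        if e <= literals(cand, fluents) <= upper:
            result.append(cand)
    return result


FLUENTS = ["loaded", "alive"]
# Yale Shooting laws written as "a causes l if phi".
LAWS = [
    ("Load", ("loaded", True), [("loaded", False)]),
    ("Shoot", ("alive", False), [("loaded", True)]),
    ("Shoot", ("loaded", False), [("loaded", True)]),
]

start = frozenset({"alive"})
after_load = successors(LAWS, "Load", start, FLUENTS)
after_shoot = successors(LAWS, "Shoot", after_load[0], FLUENTS)
print(after_load, after_shoot)
```

Fluents untouched by $E(a, w)$ keep their value because $w'$ may only draw its remaining literals from $w$, which is how the semantics of $\mathcal{A}$ builds in inertia. If two laws fire with complementary heads, no complete interpretation contains both, so the successor list is empty.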

Let $\Gamma$ be a set of expressions of the form: now $l$. A query "necessarily $\varphi$
after $a_1, \ldots, a_n$" is a consequence of $\Gamma$ in $T$ if, for any chain $(w_0, w_1) \in R_{a_1}$,
$\ldots$, $(w_{n-1}, w_n) \in R_{a_n}$, whenever $w_0$ satisfies $l$ for each now $l \in \Gamma$, $w_n$ satisfies
$\varphi$. A query "necessarily $\varphi$ after $a_1, \ldots, a_n$" is a consequence of $\Gamma$ with respect
to an action description $E$ if it is a consequence of $\Gamma$ in any transition system
of $E$.
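The transition relation and the notion of consequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: states are represented as complete sets of (fluent, truth value) literals, conditions and queries are restricted to conjunctions of literals, and the shooting domain at the end is a hypothetical example that does not come from the paper.

```python
from itertools import product

# Assumed representation: a literal is a (fluent, truth_value) pair and a
# state is a frozenset holding exactly one literal per fluent.

def effects(laws, action, state):
    """E(a, w): heads of all laws 'a causes l if phi' whose phi holds in w."""
    return {head for (act, head, cond) in laws
            if act == action and cond <= state}

def all_states(fluents):
    """Every interpretation of the fluents, as a frozenset of literals."""
    return [frozenset(zip(fluents, values))
            for values in product([True, False], repeat=len(fluents))]

def successors(laws, fluents, action, state):
    """All w' with E(a, w) <= w' <= E(a, w) | w."""
    eff = effects(laws, action, state)
    return [w for w in all_states(fluents) if eff <= w <= eff | state]

def necessarily(laws, fluents, gamma, phi, actions):
    """True iff every chain starting in a state that satisfies gamma ends in phi."""
    frontier = [w for w in all_states(fluents) if gamma <= w]
    for action in actions:
        frontier = [w2 for w in frontier
                    for w2 in successors(laws, fluents, action, w)]
    return all(phi <= w for w in frontier)

# Hypothetical example domain (not from the paper).
FLUENTS = ["alive", "loaded"]
LAWS = [
    ("load", ("loaded", True), frozenset()),                     # load causes loaded
    ("shoot", ("alive", False), frozenset({("loaded", True)})),  # shoot causes -alive if loaded
]
GAMMA = frozenset({("alive", True)})   # now alive
DEAD = frozenset({("alive", False)})   # query formula: -alive
```

With these definitions, `necessarily(LAWS, FLUENTS, GAMMA, DEAD, ["load", "shoot"])` holds, whereas the same query after `["shoot"]` alone does not, because the initial value of `loaded` is left open by `GAMMA`.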

According to the translation between $\mathcal{A}$ and PDL shown in the Appendix, we
can easily transform an action description and a state description between the two
languages. Since such a translation is one-to-one, we will only use the PDL language
to describe action descriptions, initial states and queries; which one is meant is
easily recognized from context. It is easy to see that an action description in
language $\mathcal{A}$ is always normal and safe. There is an important difference between
the semantics of the action language and PDL. In $\mathcal{A}$, there is no explicit expression
for the qualification of actions. An underlying assumption in the semantics, called
Qualification Completeness, is that an action is always executable unless the action
description implies that it is not. In PDL, there is no such assumption. Thus the
qualification of actions must be explicitly specified.

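Let $E$ be a finite action description in $\mathcal{A}$. Suppose the action laws about an
action $a$ and fluent $f$ are $\varphi_1 \to [a]f$ and $\varphi_2 \to [a]\neg f$. This implies that $a$ is
not executable when $\varphi_1 \wedge \varphi_2$. Collecting all the conditions of non-executability:
$\varphi_1^1 \wedge \varphi_2^1, \ldots, \varphi_1^k \wedge \varphi_2^k$, we know that $a$ is not executable if
$(\varphi_1^1 \wedge \varphi_2^1) \vee \cdots \vee (\varphi_1^k \wedge \varphi_2^k)$.
By Qualification Completeness, we assume that
$\neg((\varphi_1^1 \wedge \varphi_2^1) \vee \cdots \vee (\varphi_1^k \wedge \varphi_2^k)) \to \langle a \rangle \top$;
such a condition is an induced qualification law. Let $\Lambda$ be the set of all
such laws from $E$. Then we have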

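Observation 2 Let $E$ be a finite action description and $\Gamma$ a finite set of axioms,
both in $\mathcal{A}$. A query "necessarily $\varphi$ after $a_1, \ldots, a_n$" is a consequence of $\Gamma$ with
respect to $E$ iff $\vdash^{E \cup \Lambda \cup \Delta} (\bigwedge \Gamma) \to [a_1; \cdots; a_n]\varphi$, where $\Lambda$ is the set of induced
qualification laws from $E$ and $\Delta$ is the set of frame axioms with respect to $E \cup \Lambda$.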
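The construction of induced qualification laws can be illustrated by a small Python sketch. It assumes that each action law is stored as a triple (action, head literal, condition), with conditions given as sets of literals; the `flip` domain is a hypothetical example and not taken from the paper.

```python
from itertools import product

def nonexecutability_conditions(laws, action):
    """Collect phi1 & phi2 for each pair of laws phi1 -> [a]f and phi2 -> [a]-f."""
    pos = [(f, c) for (a, (f, v), c) in laws if a == action and v]
    neg = [(f, c) for (a, (f, v), c) in laws if a == action and not v]
    return [c1 | c2 for (f1, c1), (f2, c2) in product(pos, neg) if f1 == f2]

def executable(laws, action, state):
    """Induced qualification law: a is executable iff no collected condition holds."""
    return not any(c <= state
                   for c in nonexecutability_conditions(laws, action))

# Hypothetical laws: -stuck -> [flip]up  and  heavy -> [flip]-up.
LAWS = [
    ("flip", ("up", True), frozenset({("stuck", False)})),
    ("flip", ("up", False), frozenset({("heavy", True)})),
]
BLOCKED = frozenset({("up", False), ("stuck", False), ("heavy", True)})
FREE = frozenset({("up", False), ("stuck", False), ("heavy", False)})
```

Here `flip` is not executable in `BLOCKED`, where both conflicting conditions hold, but it is executable in `FREE`.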

Clearly,  the  expressive  power  of  A  is  quite  restricted.  Action  descriptions  can 
only  be  normal.  And  queries  can  only  be  simple  in  our  terminology. 

6  Encoding  Baker’s  Solution  in  PDL  Models 

Finally  we  consider  Baker’s  solution.  First,  we  have  to  recall  the  basic  assumption 
of  the  approach. 


6.1  Models  of  Situation  Calculus 

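A model of the situation calculus [1,19] (an SC-model) $\mathcal{M}$ consists of the various
domains: the domain of situations $|\mathcal{M}|_s$, the domain of actions $|\mathcal{M}|_a$ and the
domain of fluents $|\mathcal{M}|_f$; as well as interpretations for the constants:

1. Interpretations for the relations Holds and Ab:

$Holds^{\mathcal{M}} \subseteq |\mathcal{M}|_f \times |\mathcal{M}|_s$,  $Ab^{\mathcal{M}} \subseteq |\mathcal{M}|_a \times |\mathcal{M}|_f \times |\mathcal{M}|_s$,

2. Interpretation for the Result function:

$Result^{\mathcal{M}} \in (|\mathcal{M}|_a \times |\mathcal{M}|_s \to |\mathcal{M}|_s)$.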

The  following  axioms  were  used  in  Baker’s  circumscriptive  solution  to  the 
frame  problem: 

1. Unique names axioms:

- Unique Name Axioms for Fluents (UNAF): for any distinct $f_1, f_2 \in Flu$, $f_1 \neq f_2$.
- Unique Name Axioms for Actions (UNAA): for any distinct $a_1, a_2 \in Act_P$, $a_1 \neq a_2$.

2. Commonsense Law of Inertia (CLI):

$\neg Ab(a, f, s) \to (Holds(f, Result(a, s)) \leftrightarrow Holds(f, s))$

3. Domain Closure Axioms:

- Domain Closure Axiom for Fluents (DCAF):

$f = f_1 \vee f = f_2 \vee \cdots \vee f = f_n \vee \cdots$

- Domain Closure Axiom for Actions (DCAA):

$a = a_1 \vee a = a_2 \vee \cdots \vee a = a_n \vee \cdots$

4. Existence of Situation Axioms (ESA):

$\exists s (Holds(f_1, s) \wedge Holds(f_2, s) \wedge \cdots \wedge Holds(f_n, s) \wedge \cdots)$

$\exists s (Holds(f_1, s) \wedge \neg Holds(f_2, s) \wedge \cdots \wedge Holds(f_n, s) \wedge \cdots)$

$\vdots$

$\exists s (\neg Holds(f_1, s) \wedge \neg Holds(f_2, s) \wedge \cdots \wedge \neg Holds(f_n, s) \wedge \cdots)$

For the sake of simplicity, we omit the formal presentation of domain closure
and existence of situation axioms and ignore language differences in the
representation of an action description based on the translation in the Appendix.
Therefore, $E$ is an action description in the situation calculus if it is a translation
from an action description in PDL. Furthermore, an SC model $\mathcal{M}$ is an $E$-model
if $\mathcal{M}$ satisfies all the formulas in $E$.


6.2  Relations  between  SC  Models  and  PDL  Models 
First,  we  translate  an  SC  model  to  a  PDL  model. 

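Definition 7 Let $\mathcal{M}$ be an SC model. A PDL model $M = (W, \mathcal{R}, V)$ is the
corresponding model of $\mathcal{M}$ if

1. $W = |\mathcal{M}|_s$,

2. $(s_1^{\mathcal{M}}, s_2^{\mathcal{M}}) \in R_a$ iff $s_2^{\mathcal{M}} = Result^{\mathcal{M}}(a^{\mathcal{M}}, s_1^{\mathcal{M}})$,

3. $s^{\mathcal{M}} \in V(f)$ iff $Holds^{\mathcal{M}}(f^{\mathcal{M}}, s^{\mathcal{M}})$.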

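Lemma 1 Let $\mathcal{M}$ be an SC model and $M = (W, \mathcal{R}, V)$ the corresponding PDL model
of $\mathcal{M}$. Then

1. $M \models_{s^{\mathcal{M}}} \varphi$ iff $\mathcal{M} \models Holds(\varphi, s)$.

2. $M$ is functional.

3. If $\mathcal{M}$ satisfies the Commonsense Law of Inertia, then $(a, f, s^{\mathcal{M}}) \in Chg(M)$
iff $(a^{\mathcal{M}}, f^{\mathcal{M}}, s^{\mathcal{M}}) \in Ab^{\mathcal{M}}$.

4. If $E$ is a normal action description, then $M$ is an $E$-model iff $\mathcal{M}$ is an
$E$-model.

5. If $\mathcal{M}$ satisfies the Existence of Situation Axioms, then $M$ is saturated.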

Next,  we  consider  the  transformation  of  PDL  models  to  SC  models. 

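Definition 8 Let $M = (W, \mathcal{R}, V)$ be a functional PDL model. An SC model
$\mathcal{M}$ is the corresponding model of $M$ if

1. $|\mathcal{M}|_f = Flu$, $|\mathcal{M}|_a = Act_P$, $|\mathcal{M}|_s = W$.

2. $s' = Result^{\mathcal{M}}(a, s)$ iff $(s, s') \in R_a$.

3. $(f, s) \in Holds^{\mathcal{M}}$ iff $f \in V(s)$.

4. $(a, f, s) \in Ab^{\mathcal{M}}$ iff $(a, f, s) \in Chg(M)$.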
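The construction in Definition 8 can be sketched in Python as follows. The sketch rests on assumptions that should be read as such: the valuation is stored as a map from worlds to the set of fluents true there, and $Chg(M)$ is read as the set of triples $(a, f, s)$ for which $f$ changes its truth value along the $a$-step from $s$ (its definition lies outside this section). The two-world toggle model is a hypothetical example.

```python
def pdl_to_sc(fluents, relations, valuation):
    """Build (Result, Holds, Ab) from a functional PDL model, as in Definition 8."""
    result = {}
    for action, pairs in relations.items():
        for (s, s_next) in pairs:
            if (action, s) in result:
                raise ValueError("PDL model is not functional")
            result[(action, s)] = s_next
    holds = {(f, s) for s, true_fluents in valuation.items()
             for f in true_fluents}
    # Assumed reading of Chg(M): f changes truth value along the a-step from s.
    ab = {(a, f, s) for (a, s), s_next in result.items() for f in fluents
          if ((f, s) in holds) != ((f, s_next) in holds)}
    return result, holds, ab

def satisfies_inertia(fluents, result, holds, ab):
    """Check -Ab(a,f,s) -> (Holds(f, Result(a,s)) <-> Holds(f,s))."""
    return all(((f, s_next) in holds) == ((f, s) in holds)
               for (a, s), s_next in result.items() for f in fluents
               if (a, f, s) not in ab)

# Hypothetical two-world model with one fluent and two actions.
FLUENTS = {"on"}
RELATIONS = {
    "toggle": {("w0", "w1"), ("w1", "w0")},
    "wait": {("w0", "w0"), ("w1", "w1")},
}
VALUATION = {"w0": set(), "w1": {"on"}}
RESULT, HOLDS, AB = pdl_to_sc(FLUENTS, RELATIONS, VALUATION)
```

In this example only the `toggle` steps end up in `AB`, and the resulting structure satisfies the Commonsense Law of Inertia by construction.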
Lemma 2 Let $\mathcal{M}$ be the corresponding model of a functional PDL model $M =
(W, \mathcal{R}, V)$. Then

1. $\mathcal{M} \models Holds(\varphi, s)$ iff $M \models_{s^{\mathcal{M}}} \varphi$.

2. $\mathcal{M}$ satisfies the Commonsense Law of Inertia.

3. $\mathcal{M}$ satisfies the Domain Closure Axioms for Fluents and Actions.

4. $\mathcal{M}$ satisfies the Unique Names Axioms for Fluents and Actions.

5. If $E$ is a normal action description, then $\mathcal{M}$ is an $E$-model iff $M$ is an
$E$-model.

6. If $M$ is saturated, then $\mathcal{M}$ satisfies the Existence of Situation Axioms.

The  following  shows  the  relationship  between  SC  models  and  PDL  models. 

Lemma 3 Let $M$ be a functional PDL model. If $\mathcal{M}$ is the corresponding model
of $M$, then $M$ is the corresponding model of $\mathcal{M}$. Conversely, suppose that $\mathcal{M}$ is
an SC model and $M$ the corresponding model of $\mathcal{M}$. If $\mathcal{M}$ satisfies:

1. Domain Closure Axioms for Fluents and Actions,

2. Unique Names Axioms for Fluents and Actions,

then $\mathcal{M}$ is the corresponding model of $M$.

6.3  Relationship  between  Baker’s  Circumscription  Policy  and 
PDL-Model-Based  Minimization 

Since the Holds function cannot be nested, not every formula in PDL can be
translated into the situation calculus language. We call an action description
SC-expressible if it can be translated into the situation calculus language.

Observation 3 Let $E$ be a deterministic and SC-expressible action description.

1. $M \in \min(Mod_F(E))$ if and only if its corresponding SC model is a model
of $CIRCUM(E \cup \Omega; Ab; Result)$.

2. $\mathcal{M}$ is a model of $CIRCUM(E \cup \Omega; Ab; Result)$ if and only if its
corresponding model is in $\min(Mod_F(E))$.

where $\Omega$ is the set of UNAF, UNAA, DCAF and DCAA.

Note that the action description in the observation is not necessarily normal.
However, if we impose syntactical restrictions on action descriptions and queries,
we can prove that all the solutions to the frame problem we considered thus far
are equivalent. This result corresponds to the one in [14].

Corollary 1 Let $E$ be a normal action description, $\Gamma$ a finite set of literals. If
$E$ is deterministic and safe, then the following statements are equivalent:

1. $\vdash^{E \cup \Delta} (\bigwedge \Gamma) \to [a_1; \cdots; a_n]l$, where $\Delta$ is the set of frame axioms with respect
to $E$.

2. For any model $M \in \min(Mod_{NF}(E))$, $M \models (\bigwedge \Gamma) \to [a_1; \cdots; a_n]l$.

3. "necessarily $l$ after $a_1, \ldots, a_n$" is a consequence of $\Gamma$ with respect to $E$.

4. $CIRCUM(E \cup \Omega; Ab; Result) \models \forall s (Holds((\bigwedge \Gamma), s) \to Holds(l, Result([a_1, \ldots, a_n], s)))$,

where $\Omega$ is the set of UNAF, UNAA, DCAF, DCAA and ESA.
7  Conclusion and Discussion

We have encoded three typical solutions to the frame problem: Pednault's
syntax-based solution, Baker's circumscription and Gelfond & Lifschitz's action
language $\mathcal{A}$, in PDL in either syntax or semantics. Three observations have
been given which show the formal relationships among these solutions and which
are helpful for a fuller and deeper understanding of the frame problem and the
associated solutions. As a corollary of these observations, we know that for
normal and safe action descriptions and simple queries, all the solutions to the frame
problem are equivalent. This corresponds to the result in [14], where Pednault's,
Reiter's and Baker's solutions to the frame problem were compared based on
action language $\mathcal{A}$. A crucial difference between Kartha's result and ours is the
following. Action language $\mathcal{A}$ is the least expressive language among the
formalisms of action. Under its restrictions we cannot see the difference among the
solutions (Corollary 1). In contrast, dynamic logic is the most expressive formalism
at a given level (propositional or first-order). This makes a systematic comparison
of formalisms of action possible. Additionally, the soundness and completeness
of dynamic logic bridge the syntax and semantics, which makes the unification
of different approaches possible.

With the help of the formal results in the paper, we would like to make the
following remarks:

Syntactical restrictions: The equivalence among the solutions to the frame
problem relies heavily on the syntactical restrictions on action descriptions and
queries. For instance, if $E$ is not normal, the validity of a formula $A$ in all the
natural saturated $E$-models does not guarantee $\vdash^{E} A$. Thus the link between
$E$-provability in PDL and provability from transition systems of action
language $\mathcal{A}$ will not exist. Additionally, the form of queries is also crucial to
the equivalence. Fortunately, the link between minimizing PDL models and
minimizing SC models does not depend on the normality of the action description.

Extensibility of action formalisms: Each formalism of action has been or
is intended to be extended to accommodate non-deterministic and indirect
effects of actions and compound and concurrent actions. Compatible extensions of
these formalisms will approximate dynamic logic in expressiveness. For instance,
to extend $\mathcal{A}$ to express general queries requires transition systems to allow
"non-natural" models according to Proposition 2. Currently, to express programs or
compound actions, dynamic logic might be the best formalism among the existing
ones.

Epistemic minimization and physical minimization: We know that
Baker's circumscriptive policy (varying Result) corresponds exactly to the
minimization of PDL models. We may remember that we took a detour, varying
Holds, before we reached the "right solution": state-minimization [24]. Such a
detour does not seem necessary in PDL models or transition systems. There is
a subtle difference between circumscriptive first-order models and minimizing
PDL models. With circumscription we minimize abnormality whereas in PDL
we minimize change of worlds. We refer to the former kind of minimization as
epistemic and the latter as physical.
Action-oriented frame problem: We have considered Pednault's syntax-based
solution and Baker's model-based solution to the frame problem. However,
these solutions only work for the so-called fluent-oriented frame problem
(see [13]). A remaining challenge is to encode the solutions to the action-oriented
frame problem, i.e., how to make it the default in PDL that only actions
mentioned in the action description have effects. A typical approach to the
action-oriented frame problem is using action variables to range over all actions which
have effects. A compact representation of frame axioms can then be offered by
using the Explanation Closure Assumption and quantifying over action
variables [23,13]. Such an approach cannot be encoded in PDL because there are
no action variables (even in first-order versions). In [5], it was shown that given
a normal and safe action description $E$, if $\vdash^{E \cup \Delta} A$, where $\Delta$ is the set of all
the frame axioms with respect to $E$, then there is a subset $\Delta'$ of $\Delta$ such that
$\vdash^{E \cup \Delta'} A$ and all the action symbols that occur in $\Delta'$ occur in $A$. This means
that if we postpone listing frame axioms till a query arises, frame axioms in which
the actions are irrelevant to the query are not needed for answering the query.
Therefore, the action-oriented frame problem is not a problem in this sense.


Appendix:  Translations  between  Languages 

We now provide an intertranslation between dynamic logic, situation calculus
and action languages. This intertranslation is not formal. For instance, a fluent
symbol stands for a proposition in PDL but is an individual in situation calculus.
Again, $Holds(\varphi, S_0)$ makes sense only with the extended Holds predicate.
Additionally, all these translations depend on the semantics of the associated
action logics.


1. Expressions for describing initial state:

Dynamic Logic        Situation Calculus         Action Language $\mathcal{A}$
$f$                  $Holds(f, S_0)$            now $f$
$\neg f$             $\neg Holds(f, S_0)$       now $\neg f$
$\varphi$            $Holds(\varphi, S_0)$

2. Expressions for describing queries:

Dynamic Logic                      Situation Calculus                                   Action Language $\mathcal{A}$
$[a_1; \cdots; a_n]\varphi$        $Holds(\varphi, Result([a_1, \ldots, a_n], s))$     $\varphi$ after $a_1, \ldots, a_n$

3. Expressions for describing causation between propositions:

Dynamic Logic        Situation Calculus                                   Action Language $\mathcal{B}$
[illegible]          $Holds(\varphi, s) \to Caused(\psi, true, s)$        $\psi$ if $\varphi$

4. Expressions for describing domain axioms:

Dynamic Logic                          Situation Calculus                                              Action Language $\mathcal{A}$ or $\mathcal{C}$
$\varphi \to [a]l$                     $\forall s (Holds(\varphi, s) \to Holds(l, Result(a, s)))$    a causes $l$ if $\varphi$
                                                                                                       a may cause $l$ if $\varphi$
$\varphi \to \langle a \rangle \top$   $\forall s (Holds(\varphi, s) \to Poss(a, s))$                executable a if $\varphi$
Abstract. The language $\mathcal{E}$ for reasoning about actions and change can
be translated into an argumentation framework. In this paper, we extend
this translation of the basic language and show how it can, together with
methods from abduction, form the basis for a principled implementation
of $\mathcal{E}$. The extension we have considered concerns the addition of new
types of sentences in the language as well as allowing theories where the
narrative of events given is incomplete.

A system, called $\mathcal{E}$-RES, is developed within the argumentation
framework of Logic Programming without Negation as Failure (LPwNF). This
can directly support a variety of modes of common sense reasoning such
as: default persistence in credulous or sceptical form, assimilation of
observations and their diagnosis possibly under incomplete information, as
well as combinations of these. To improve the efficiency of the system
we have considered the integration of a SAT solver within the LPwNF
computation, to carry out the task of validating the time universal constraints
imposed by ramification statements.


1  Introduction 

General formalisms of action and change can provide a natural framework for
a variety of AI problems such as diagnosis, planning and cognitive robotics.
They can offer a high level of expressivity and a basis for the development of a
computational framework to solve these problems. In this paper we study how
one such formalism, the Language $\mathcal{E}$ [10], can be developed into a framework
capable of supporting a variety of basic reasoning modes needed to address these
types of AI applications.

The computational foundation of this framework and its associated system,
called $\mathcal{E}$-RES, is a re-formulation of the Language $\mathcal{E}$ in terms of
argumentation [2], within the framework of Logic Programming without Negation as Failure
(LPwNF) [3], together with a synthesis of methods from abductive reasoning [9].
This allows a principled implementation of the $\mathcal{E}$-RES system in a way that
separates issues of expressiveness and efficiency. It is then possible to examine how
we can use in a modular way "external" solvers, e.g. a SAT solver [6] or a
notion of relevancy of part of the theory to the goal at hand, for improving the
computational behaviour of the framework.

T. Eiter, W. Faber, and M. Truszczynski (Eds.): LPNMR 2001, LNAI 2173, pp. 254-266, 2001.

2  The  Language  E  and  Its  Model  Semantics 

The vocabulary of the Language $\mathcal{E}$ consists of a set $\Phi$ of fluent constants, a
set $\Delta$ of action constants, and a partially ordered set $\langle \Pi, \preceq \rangle$ of time-points. This
vocabulary depends each time on the domain being modeled. A fluent literal is
either a fluent constant $F$ or its negation $\neg F$. In the current implementation
of the $\mathcal{E}$-RES system the only time structure that is supported is that of the
natural numbers, so we restrict our attention here to domains of this type.

Domain descriptions in the Language $\mathcal{E}$ are collections of the following kinds
of statements (where $A$ is an action constant, $T$ is a time-point, $F$ is a fluent
constant, $L$ is a fluent literal and $C$ is a set of fluent literals):

- t-propositions: L holds-at T

- h-propositions: A happens-at T

- c-propositions: A initiates F when C, or A terminates F when C

- r-propositions: L whenever C

- p-propositions: A needs C.

T-propositions  are  used  to  record  observations  that  particular  fluents  hold  or 
do  not  hold  at  particular  time-points.  H-propositions  are  used  to  state  that 
particular  actions  occur  at  particular  time-points.  C-propositions  state  general 
“action  laws”  -  the  intended  meaning  of  “A  initiates  F  when  C”  is  “C  is 
a  minimally  sufficient  set  of  conditions  for  an  occurrence  of  A  to  initiate  F”. 
R-propositions  serve  a  dual  role  in  that  they  describe  both  static  constraints 
between  fluents  and  ways  in  which  fluents  may  be  indirectly  affected  by  action 
occurrences.  P-propositions  state  necessary  conditions  for  an  action  to  occur. 

The semantics of $\mathcal{E}$ is based on a notion of a model of a domain $D$. A map,
$H : \Phi \times \Pi \to \{true, false\}$, is an interpretation of $D$. Given a time point $T$
and a fluent constant $F$ we first define the notion of an initiation-point
(termination-point resp.) for $F$ in $H$ relative to $D$ as follows. Consider first the case
where $D$ contains no r-propositions. Then $T$ is an initiation-point
(termination-point resp.) for $F$ in $H$ relative to $D$ iff there is an action constant $A$ such
that (i) $D$ contains both an h-proposition A happens-at T and a c-proposition A
initiates (terminates, resp.) F when C, and (ii) $H$ satisfies $C$ at $T$ (i.e. for each
$F \in C$, $H(F, T) = true$, and for each $F'$ such that $\neg F' \in C$, $H(F', T) = false$).

When the domain $D$ contains r-propositions this definition has to be
extended to allow for initiation or termination points that are generated recursively
through these r-propositions.

Definition 1. (Initiation/termination point) Let $H$ be an interpretation of $\mathcal{E}$,
and $D$ be a domain description. Let $W$ be the set $2^{\Phi \times \Pi} \times 2^{\Phi \times \Pi}$ and let the
operator $\mathcal{F} : W \to W$ be defined as follows. For each $\langle In, Te \rangle \in W$ denote
$\mathcal{F}(\langle In, Te \rangle)$ by $\langle In', Te' \rangle$. Then for any $F \in \Phi$ and $T \in \Pi$, $(F, T)$ is in $In'$ (resp.
in $Te'$) iff one of the following two conditions holds.

1. There is an $A \in \Delta$ s.t. (i) there is both an h-proposition in $D$ of the form
"A happens-at T" and a c-proposition in $D$ of the form "A initiates F
when C" (resp. "A terminates F when C") and (ii) $H$ satisfies $C$ at $T$.

2. There is an r-proposition in $D$ of the form "F whenever C" (resp. "$\neg$F
whenever C") and a partition $\{C_1, C_2\}$ of $C$ such that (i) $C_1$ is non-empty,
for each fluent constant $F' \in C_1$, $(F', T) \in In$, and for each fluent literal
$\neg F' \in C_1$, $(F', T) \in Te$, and (ii) there is some $T_2 \in \Pi$, $T \prec T_2$, such that
for all $T_1$, $T \preceq T_1 \preceq T_2$, $H$ satisfies $C_2$ at $T_1$.

Let $\langle In^{\mathcal{F}}, Te^{\mathcal{F}} \rangle$ be the least fixed point of the (monotonic) operator $\mathcal{F}$ starting from
the empty tuple $\langle \emptyset, \emptyset \rangle$. $T$ is an initiation-point (resp. termination-point) for $F$
in $H$ relative to $D$ iff $(F, T) \in In^{\mathcal{F}}$ (resp. $(F, T) \in Te^{\mathcal{F}}$).

It is useful to note that any initiation or termination point at some time $T$
relative to $D$, defined in this way, must refer to at least one known h-proposition
at $T$ in the domain $D$.

Given this notion of an initiation and termination point, an interpretation
$H$ is a model of $D$ iff, for every fluent constant $F$ and time-points $T_1 \prec T_3$:

1. If there is no initiation- or termination-point $T_2$ for $F$ in $H$ such that
$T_1 \preceq T_2 \prec T_3$, then $H(F, T_1) = H(F, T_3)$.

2. If $T_1$ is an initiation-point for $F$ in $H$, and there is no termination-point $T_2$
for $F$ in $H$ such that $T_1 \prec T_2 \prec T_3$, then $H(F, T_3) = true$.

3. If $T_1$ is a termination-point for $F$ in $H$, and there is no initiation-point $T_2$
for $F$ in $H$ such that $T_1 \prec T_2 \prec T_3$, then $H(F, T_3) = false$.

4. $H$ satisfies the following constraints:

- For all "F holds-at T" in $D$, $H(F, T) = true$, and for all "$\neg$F holds-at
T'" in $D$, $H(F, T') = false$.

- For all "A needs C" in $D$ and "A happens-at T" in $D$, $H$ satisfies $C$ at $T$.

- For all "L whenever C" in $D$, and time-points $T$, if $H$ satisfies $C$ at $T$
then $H$ satisfies $\{L\}$ at $T$.

A domain $D$ entails (written $D \models$) the t-proposition F holds-at T
($\neg$G holds-at T, resp.), iff for every model $H$ of $D$, $H(F, T) = true$ ($H(G, T) =
false$, resp.).
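For small domains, the model conditions and the entailment relation can be checked by brute force. The Python sketch below is an illustration under explicit assumptions: time is a finite initial segment of the natural numbers, r-propositions are not supported (so the recursive part of Definition 1 is left out), condition 1 is read with $T_1 \preceq T_2 \prec T_3$ and conditions 2 and 3 with $T_1 \prec T_2 \prec T_3$, and the dictionary layout of a domain is a hypothetical encoding. The example is a reduced bulb domain without any r-proposition.

```python
from itertools import product

def satisfies(H, C, T):
    """H satisfies the literal set C at T; a literal is a (fluent, sign) pair."""
    return all(H[(f, T)] == sign for (f, sign) in C)

def points(D, H, kind):
    """Initiation or termination points (F, T), for domains without r-propositions."""
    return {(F, T) for (A, T) in D["happens"]
            for (k, A2, F, C) in D["effects"]
            if k == kind and A2 == A and satisfies(H, C, T)}

def is_model(D, H, fluents, times):
    init, term = points(D, H, "initiates"), points(D, H, "terminates")
    both = init | term
    for F in fluents:
        for T1, T3 in product(times, repeat=2):
            if not T1 < T3:
                continue
            between = range(T1 + 1, T3)              # T1 < T2 < T3
            if not any((F, T2) in both for T2 in range(T1, T3)):
                if H[(F, T1)] != H[(F, T3)]:         # condition 1
                    return False
            if (F, T1) in init and not any((F, T2) in term for T2 in between):
                if not H[(F, T3)]:                   # condition 2
                    return False
            if (F, T1) in term and not any((F, T2) in init for T2 in between):
                if H[(F, T3)]:                       # condition 3
                    return False
    # Condition 4, restricted to t- and p-propositions.
    if not all(satisfies(H, [lit], T) for (lit, T) in D["holds"]):
        return False
    return all(satisfies(H, C, T) for (A, C) in D["needs"]
               for (A2, T) in D["happens"] if A2 == A)

def entails(D, fluents, times, literal, T):
    """True iff the literal holds at T in every model of D (by enumeration)."""
    keys = [(f, t) for f in fluents for t in times]
    for values in product([True, False], repeat=len(keys)):
        H = dict(zip(keys, values))
        if is_model(D, H, fluents, times) and not satisfies(H, [literal], T):
            return False
    return True

# Reduced bulb domain (hypothetical encoding, no r-proposition).
FLUENTS, TIMES = ["Light", "Normal"], range(5)
BULB = {
    "effects": [("initiates", "SwitchOn", "Light", [("Normal", True)]),
                ("terminates", "SwitchOff", "Light", []),
                ("terminates", "Break", "Normal", [])],
    "needs": [("SwitchOn", [("Light", False)])],
    "happens": [("SwitchOn", 2)],
    "holds": [(("Normal", True), 0)],
}
```

Under these assumptions `entails(BULB, FLUENTS, TIMES, ("Light", True), 4)` holds. Adding a hypothetical occurrence `("Break", 1)` to `"happens"` removes this consequence, since `Normal` no longer holds when `SwitchOn` occurs.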

The first three conditions for a model encapsulate a notion of default
persistence for fluents whereas the fourth condition imposes other constraints on the
model from explicit information about the fluents given in $D$. This separation
allows a modular extension of the language and, as we will see, facilitates the
development of a proof theory for it.
Example 1. (Bulb Domain: $D_b$)

SwitchOn initiates Light when {Normal}    ($D_b$1)
SwitchOff terminates Light                ($D_b$2)
Break terminates Normal                   ($D_b$3)
$\neg$Light whenever {$\neg$Normal}       ($D_b$4)
SwitchOn needs {$\neg$Light}              ($D_b$5)
SwitchOn happens-at 2                     ($D_b$6)
Normal holds-at 0                         ($D_b$7)

In this example, $D_b$ entails Light holds-at 4 but not when $D_b$ is extended with
Break happens-at 3.

The above semantics assumes that no events occur other than those explicitly
given in the domain description $D$. This is not always a valid assumption and it is
possible to have domains where some action types are open, e.g. in the example
above Break could be considered as open. Following work on abduction [9], we
define a notion of generalized model of $D$ as any model of $D \cup Ab$, where $Ab$
is any set of h-propositions over the open action types in $D$. A corresponding
entailment is then defined in terms of these generalized models.


3  An Argumentation Proof Theory for $\mathcal{E}$

The basic subset of the language $\mathcal{E}$, comprising only h- and c-propositions,
has been re-formulated into the argumentation framework of LPwNF in [11]. In
this section, we give a brief review of the argumentation formulation of $\mathcal{E}$ and
show how it can be extended when we extend the syntax of the basic language or
when we allow open action types in a domain description. This results in a proof
theory for $\mathcal{E}$ which in turn will form the basis of the $\mathcal{E}$-RES system implementing
the language.

The argumentation re-formulation of $\mathcal{E}$ translates a domain $D$, over the
basic subset of the language, into an argumentation program $P_{\mathcal{E}}(D) =
(B(D), A_{\mathcal{E}}, <_{\mathcal{E}})$ in LPwNF. The background monotonic logic $(\mathcal{L}, \vdash)$ of the
LPwNF framework is:

$\mathcal{L}$ consists of all sentences $A_0 \leftarrow A_1, \ldots, A_n$ ($n \geq 0$), with $A_i$, $0 \leq i \leq n$,
positive or negative (via a negation or complement operator, $\neg$) literals,
and all variables implicitly universally quantified from the outside, and
$\vdash$ is obtained by repeatedly applying the classical modus ponens inference
rule

$\frac{Y}{X}$

with $X \leftarrow Y$ any ground instance of a sentence in $\mathcal{L}$.


Given $D$, $B(D)$, called the background theory for $D$, is given by:

- If A happens-at T $\in D$, then $Happens(A, T) \in B(D)$.

- If A initiates F when $\{L_1, \ldots, L_n\} \in D$, then $B(D)$ contains a rule for
Initiation:

$Initiation(F, t) \leftarrow Happens(A, t), HoldsAt(L_1, t), \ldots, HoldsAt(L_n, t)$.

Similarly, for "terminates" c-propositions a rule for Termination is given in
$B(D)$. (Here and below $HoldsAt(\neg F_i, t)$ stands for $\neg HoldsAt(F_i, t)$.)
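The translation of h- and c-propositions into $B(D)$ is mechanical, as the following Python sketch illustrates. The dictionary encoding of a domain and the textual clause syntax (`<-` for the rule arrow) are assumptions made for this illustration only; they are not the input format of the E-RES system.

```python
def background_theory(domain):
    """Return B(D) as a list of clause strings for h- and c-propositions."""
    clauses = [f"Happens({a},{t})" for (a, t) in domain["happens"]]
    for (kind, action, fluent, conds) in domain["effects"]:
        head = "Initiation" if kind == "initiates" else "Termination"
        body = [f"Happens({action},t)"]
        for (f, positive) in conds:
            # HoldsAt(¬F, t) stands for ¬HoldsAt(F, t), as in the text.
            body.append(("" if positive else "¬") + f"HoldsAt({f},t)")
        clauses.append(f"{head}({fluent},t) <- " + ", ".join(body))
    return clauses

# h- and c-propositions of the bulb domain in the assumed encoding.
BULB = {
    "happens": [("SwitchOn", 2)],
    "effects": [
        ("initiates", "SwitchOn", "Light", [("Normal", True)]),
        ("terminates", "SwitchOff", "Light", []),
        ("terminates", "Break", "Normal", []),
    ],
}
B_D = background_theory(BULB)
```

For the bulb domain this yields one `Happens` fact, one `Initiation` rule and two `Termination` rules.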

The rest of $P_{\mathcal{E}}(D)$ is independent of any given domain $D$. $A_{\mathcal{E}}$, called the
argumentation theory, consists of:

Generation rules:

$HoldsAt(f, t_2) \leftarrow Initiation(f, t_1), t_1 \prec t_2$    ($PG[f, t_2; t_1]$)

$\neg HoldsAt(f, t_2) \leftarrow Termination(f, t_1), t_1 \prec t_2$    ($NG[f, t_2; t_1]$)

Persistence rules:

$HoldsAt(f, t_2) \leftarrow HoldsAt(f, t_1), t_1 \prec t_2$    ($PP[f, t_2; t_1]$)

$\neg HoldsAt(f, t_2) \leftarrow \neg HoldsAt(f, t_1), t_1 \prec t_2$    ($NP[f, t_2; t_1]$)

Assumption rules:

$HoldsAt(f, t)$    ($PA[f, t]$)

$\neg HoldsAt(f, t)$    ($NA[f, t]$)

Also, $<_{\mathcal{E}}$ is a priority relation defined over $A_{\mathcal{E}}$ by:

$NP[f, t; t_1] <_{\mathcal{E}} PG[f, t; t_2]$ iff $t_1 \prec t_2$,

$NG[f, t; t_1] <_{\mathcal{E}} PG[f, t; t_2]$ iff $t_1 \prec t_2$,

$PA[f, t] <_{\mathcal{E}} NG[f, t; t']$,  $PA[f, t] <_{\mathcal{E}} NP[f, t; t']$,

together with the corresponding cases where positive rules are replaced by
negative rules and vice versa.

The essential element of this translation is that it formalizes that the effects
of later events take priority over the effects of earlier events. The argumentation
semantics of $P_{\mathcal{E}}(D)$ is given via the admissible extensions of $B(D)$. These are
subsets, $S$, of argument rules from $A'_{\mathcal{E}} \subseteq A_{\mathcal{E}}$, consisting only of generation or
assumption rules, which are added to $B(D)$. An extension $S$ is admissible iff:

- it is consistent, i.e. non-self-attacking, and

- it (counter-)attacks any set of arguments attacking it.

A set of argument rules, $A$, attacks another such set, $B$, if the two sets are
in conflict, by monotonically deriving (in $\vdash$), together with $B(D)$, complementary
literals $\lambda$ and $\neg\lambda$, respectively, and $A$ is not of lower priority than $B$. A set $A$
is of lower priority than $B$ if it has a rule of lower priority than some rule in $B$
and does not contain any rule of higher priority than some rule in $B$.
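The priority relation on rules and its lifting to sets of rules can be sketched as follows. The encoding of a rule label as a tuple (kind, fluent, t, t1), with t1 set to None for assumption rules, is a hypothetical choice for this illustration, and time-points are assumed to be natural numbers.

```python
def rule_lt(r1, r2):
    """r1 <_E r2 for labels (kind, fluent, t, t1); t1 is None for assumptions."""
    k1, f1, t1, s1 = r1
    k2, f2, t2, s2 = r2
    if f1 != f2 or t1 != t2:
        return False
    if k1[0] == k2[0]:                       # only rules of opposite sign compete
        return False
    if k1[1] in "PG" and k2[1] == "G":       # NP, NG < PG (and PP, PG < NG)
        return s1 < s2                       # the later event wins
    if k1[1] == "A" and k2[1] in "GP":       # PA < NG, NP (and NA < PG, PP)
        return True
    return False

def lower_priority(A, B, lt=rule_lt):
    """A has some rule below a rule of B and no rule above a rule of B."""
    has_lower = any(lt(a, b) for a in A for b in B)
    has_higher = any(lt(b, a) for a in A for b in B)
    return has_lower and not has_higher

# Example labels for the fluent Light at time 4.
NP = ("NP", "Light", 4, 1)    # persistence of -Light from time 1
PG = ("PG", "Light", 4, 2)    # Light generated by an initiation at time 2
PA = ("PA", "Light", 4, None) # assumption that Light holds at 4
NG = ("NG", "Light", 4, 3)    # -Light generated by a termination at time 3
```

For instance, the set `{NP}` is of lower priority than `{PG}`, so it cannot attack it, while `{PG}` is not of lower priority than `{NP}`.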

Given this translation it can be shown, under some quite general restrictions
on $D$, that the models of $D$ correspond exactly to the maximally (w.r.t. set
inclusion) admissible extensions of $P_{\mathcal{E}}(D)$. We can then use this translation to
develop an argumentation-based proof theory for $\mathcal{E}$. This proof theory is
defined in terms of derivations of trees, whose nodes are sets of arguments in $A_{\mathcal{E}}$
attacking the arguments in their parent nodes.


Let $S_0$ be a (non-self-attacking) set of arguments in $A_{\mathcal{E}}$ such that $B(D) \cup
S_0 \vdash (\neg)HoldsAt(F, T)$. Then, two kinds of derivations are defined:

- Successful derivations $S_0, \ldots, S$, building, from a tree consisting only of the
root $S_0$, a tree whose root $S$ is an admissible subset of $A'_{\mathcal{E}}$ such that $S \supseteq S_0$.

- Finitely failed derivations, guaranteeing the absence of any admissible set of
arguments containing $S_0$.

Then, for the given literal $L = (\neg)HoldsAt(F, T)$, we say that $L$ is a sceptical
consequence of $D$ iff (i) there exists a successful derivation starting with $S_0$ and
(ii) for every set $S'_0$ of argument rules in $A_{\mathcal{E}}$ such that $B(D) \cup S'_0$ derives (in $\vdash$)
the complement of $L$, every derivation for $S'_0$ is finitely failed. If only the first
condition, (i), holds we say that $L$ is a credulous consequence of $D$.

The formal details of the derivations are not needed for this paper.
Informally, both kinds of derivation incrementally consider all attacks against the
root and, whenever the root does not itself counter-attack one of these attacks, a
new set of arguments that can do this is added to the root. Then, the process is
repeated, until every attack has been counter-attacked successfully by the extended
root (successful derivation) or until some attack cannot be counter-attacked by any
extension of the root (finitely failed derivation). Examples of this proof theory
will be presented in the next section.

An  important  feature  of  the  argumentation  re-formulation  of  E  is  the  fact 
that  this  is  modular  with  respect  to  the  addition  of  new  type  of  sentences  in 
the  language.  This  follows  primarily  from  the  fact  that  the  translation  is  faith¬ 
ful  at  the  level  of  the  models  of  the  language  and  so  it  can  reflect  the  modular 
separation  of  the  model  definition  into  two  parts:  conditions  (1-3)  encapsulat¬ 
ing  default  persistence  and  condition  4  for  extra  constrednts.  When  we  add  r- 
propositions  we  only  need  to  extend  the  background  definitions  of  Initiation 
and  Termination  in  B(D)  without  changing  the  type  of  arguments  in  Ps{B). 
For  each,  L  whenever  C  a  fact  Whenever{LjC)^  is  added  to  B{D),  and  the 
definitions  of  Initiation  and  Termination  are  augmented  with: 

$$\mathit{Start}(l,t) \leftarrow \mathit{Whenever}(l,C),\ \mathit{Select}(C,\{l_1,\ldots,l_n\}),\ \mathit{Start}(l_1,t),\ \mathit{HoldsAt}(l_2,t^+),\ \ldots,\ \mathit{HoldsAt}(l_n,t^+)$$

where for a positive literal l = F, Start(l,t) is to be read as Initiation(F,t)
and for a negative literal, l = ¬F, as Termination(F,t), and t+ is the next
immediate time after t. Hence every event that brings about any literal, l_1, of C
while the rest of C, {l_2,...,l_n}, continues to hold also brings about, through
the r-proposition, L.

In turn, the only extension required to the proof theory is to add, for any
r-proposition L whenever {L_1,...,L_k}, to the root S of any derivation a set
of arguments, S_r, so that B(D) ∪ S ∪ S_r ⊢ φ where φ is the (classical)
formula HoldsAt(L,t) ← HoldsAt(L_1,t) ∧ ... ∧ HoldsAt(L_k,t). Similarly, when we
extend the language with t- and p-propositions we need to extend the proof
theory by adding to the root of a derivation a set of arguments, S_t and S_p, so
that they can derive (with B(D)) the HoldsAt literals corresponding to these
sentences. The proof theory then continues as before but now with the extra
attacks against S_r, S_t and S_p to be considered.
Finally, when we have open action types in the domain D the proof theory
is extended to allow the abduction of a new set of events, H, and hence new
generation arguments can be added to the root. Derivations are now defined in
terms of tuples, <H,S>. In this extended proof theory it is possible for new
attacks to be generated during the derivation due to the new events abduced.
The proof theory therefore now includes suspended attacks which can become
actual attacks when H grows. If this happens then these attacks need to be
counter-attacked as usual; suspended attacks that remain so until the
end of the derivation are ignored.

Theorem 1. (Soundness and Completeness of the Extended Proof Theory)

Let D be a domain description, possibly with open action types, and S_0 a consistent
set of arguments. If there exists a successful extended derivation from
<∅,S_0> to <H,S> then there exists a generalized model, M, of D such
that (i) M is also a model of D ∪ H and (ii) M satisfies L for any literal L s.t.
B(D ∪ H) ∪ S ⊢ L. Also, if every extended derivation from <∅,S_0> is finitely
failed, then there exists no generalized model, M, of D such that M satisfies L
for every literal L s.t. B(D ∪ H) ∪ S_0 ⊢ L, where H is the set of h-propositions
corresponding to M.

Conversely, let M be a generalized model of D such that its corresponding set
of h-propositions H is finite and M satisfies L. Then there exists a set of arguments
S_0 and a successful extended derivation from <∅,S_0> to <H',S> such
that B(D ∪ H') ∪ S_0 ⊢ L and H' ⊆ H.

4  Reasoning with the Language E

The language E can support in a natural way a variety of modes of reasoning with
actions and observations. The argumentation-based computational model for E,
described in the previous section, allows a principled implementation of these
forms of reasoning. In this section we present some of these forms of reasoning
and explain briefly how they are mapped into argumentation.


4.1  Default Persistence

The argumentation translation of the language E maps the basic reasoning of
default persistence captured by the model theoretic semantics of E into an
argumentation reasoning. Consider the following example where vaccine A provides
protection  only  for  people  with  blood  type  O,  and  vaccine  B  for  people  with 
blood  type  other  than  O. 


^ All results in this paper refer to domains with discrete linear time, a finite number
of h-propositions and a restriction that limits the possibility for events to
simultaneously initiate and terminate the same fluent.
Example 2. (Vaccinations - No open actions)

InjectA initiates Protected when {TypeO}  (Dv1)

InjectB initiates Protected when {¬TypeO}  (Dv2)

InjectA happens-at 2  (Dv3)

InjectB happens-at 3  (Dv4)


Given this domain, we can show that at any time Tf after 3, G = {Protected
holds-at Tf} is a sceptical consequence of Dv, whereas for times less than or equal
to 3 Protected is only a credulous consequence. An argument S_0 for G is given
by S_0 = {PG[Protected,Tf;2], PA[TypeO,2]}, i.e. a generation argument based
on the event of InjectA at time 2 together with an assumption argument for
TypeO at time 2. All attacking arguments against this can be counter-attacked
(defended) by S_0 itself. This gives a successful derivation for S_0 and thus G is
a credulous consequence. To show that it is a sceptical consequence we consider
the opposite goal ¬G.

The only way to derive this is through the argument R_1 = {NA[Protected,Tf]}
(there are no generation arguments for ¬G). This is attacked by S_0 given
above, which can be counter-attacked (only) if R_1 is extended to R_2 with the
assumption argument NA[TypeO,2]. But R_1 and thus also R_2 are also attacked
by {PG[Protected,Tf;3], NA[TypeO,3]} via the event InjectB at time 3. To defend
against this it is now necessary to add PA[TypeO,3] to R_2 to give R_3 =
{NA[Protected,Tf], NA[TypeO,2], PA[TypeO,3]}. But then we have a new attack
against R_3 given by {NP[TypeO,3;2], NA[TypeO,2]} through a persistence
argument from time 2 to time 3. This attack can only be counter-attacked via a
generation argument for HoldsAt(TypeO,3). But no such arguments exist in Dv
and hence the derivation for ¬G finitely fails, as required.

This  example  shows  how  the  argumentation  reasoning  deals  correctly  with 
default  persistence  under  incomplete  information.  For  a  more  complex  example 
consider  the  same  goal  G  in  the  domain  below  where  the  fluents  TypeO  and 
Strong are incompletely specified.

Example 3. (Vaccinations Cnt)

InjectB initiates Protected when {¬TypeO}  (Dr1)

InjectC initiates Protected when {Strong}  (Dr2)

Strong whenever {TypeO}  (Dr3)

InjectB happens-at 2  (Dr4)

InjectC happens-at 3  (Dr5)

As above, derivations for ¬G finitely fail. The two attacks against ¬G, via
the two injection events, can only be counter-attacked by {PA[TypeO,2]} and
{NA[Strong,3]}. But then the satisfaction of the ramification statement requires
{PA[Strong,2]} to be added. In turn this gives a new persistence attack
of Strong from time 2 to 3 which cannot be counter-attacked as there is no
generation rule for ¬Strong.
4.2  Assimilating Observations and Diagnosis

A domain description in the language E may contain observations (t-propositions)
about some of its fluents. The observations can refer either to some initial
time  or  any  other  time  point.  An  argument  for  a  default  conclusion  to  be  valid 
must  also  be  extensible  to  an  admissible  superset  that  is  able  to  confirm  these 
observations.  This  extra  requirement  gives  a  form  of  reasoning  from  effects  to 
causes  both  forward  and  backward  in  time. 

Example 4. (Infections - No open actions)

Expose initiates Infected when {TypeA}  (Di1)

Expose initiates Infected when {TypeB}  (Di2)

Allergic whenever {TypeA, Infected}  (Di3)

Allergic whenever {TypeB}  (Di4)

Expose happens-at 3  (Di5)

¬Infected holds-at 1  (Di6)

Infected holds-at 6  (Di7)


The observation at time 6 requires that a generation rule argument for Infected
at 6 is added to the root of any derivation. The weaker assumption argument
PA[Infected,6] cannot defend against its persistence attack starting from time
1 where the observation of ¬Infected is given. The only possibility for such a
generation rule is the one based on the event of Expose at time 3 with either
TypeA or TypeB assumed at time 3, and consequently at any other time before
or after 3. Under any one of these assumptions the two r-propositions imply
that Allergic would hold from 6. The argumentation reasoning is thus able to
derive that ¬Allergic cannot be derived credulously and hence that Allergic holds
sceptically from 6 onwards.

Effectively,  these  observations  are  explained  in  terms  of  missing  information  on 
incomplete  fluents.  When  a  domain  contains  open  action  types  this  gives  us  a 
form  of  diagnosis  of  the  observations  in  terms  of  assumptions  both  on  incomplete 
fluents  and  on  unknown  (in  D)  events. 

Definition 2. (Diagnosis in E)

Let D be a given domain description and O a set of observations. A (strong)
diagnosis for O in D is a set H of h-propositions s.t. D ∪ H is consistent and
D ∪ H ⊨ O.

A weaker form of diagnosis, useful when we have incomplete information on
fluents whose truth cannot be affected by any action (e.g. at some initial time
point), is as follows.

Definition 3. (Conditional Diagnosis in E)

Let D be a domain description and O a set of observations. Then a weak
diagnosis for O in D is a set H of h-propositions s.t. there exists a model M of
D ∪ H where M ⊨ O. H is conditional on a set of assumptions A iff A is a
set of t-propositions (A ∩ O = ∅) such that H is a (strong) diagnosis for O in
D ∪ A. The tuple, <H,A>, is called a conditional diagnosis for O in D.

^ For simplicity of presentation we will assume that the domain does not contain
any p-propositions, P. If this is not the case then we have an extra requirement on the
diagnosis that D ∪ H ⊨ P.

Note that the assumptions in a conditional diagnosis can refer to any time
point, not necessarily to an initial time point only. In the previous example the
empty set H = ∅ is a weak diagnosis for the observation Infected holds-at 6
in the domain Di given by the sentences Di1-Di6. Two conditional diagnoses
are <∅, TypeA holds-at 3> and <∅, TypeB holds-at 3>. Note that <∅,
TypeA holds-at 1> is also a conditional diagnosis. The assumption that TypeA
holds at some time point (e.g. an initial time point 1) implies that it also holds
at any other time point as no action in Di can affect the value of this fluent.
Typically, if we have incomplete information on fluents that cannot be affected
by any action or the information is incomplete at some initial time point before
which actions cannot occur then a conditional diagnosis is appropriate.

Theorem 2. Let D be a given domain description and O a set of observations.
Let also S_0 ⊆ A_s be a set of arguments and H_0 a set of action facts such that
B(D ∪ H_0) ∪ S_0 ⊢ O. If there exists a successful extended derivation in D from
<H_0,S_0> to <H,S> then <H,A> is a conditional diagnosis for O in D,
where A = {F holds-at T | PA[F,T] ∈ S} ∪ {¬F holds-at T | NA[F,T] ∈ S}.
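
The last step of Theorem 2, reading the assumption set A off the final argument set S, can be sketched in a few lines. This is only an illustration, not the E-RES implementation: the `Argument` record, its field names and the choice of time 5 for the abduced event are assumptions made here for the example.

```python
from typing import NamedTuple, Optional


class Argument(NamedTuple):
    """One argument rule, e.g. PA[Normal,2] or PG[Light,4;2] (hypothetical encoding)."""
    kind: str                     # "PA", "NA", "PG", "NG", "PP" or "NP"
    fluent: str
    time: int
    origin: Optional[int] = None  # second time index of generation/persistence rules


def assumptions(S):
    """Return A as a set of (fluent, truth_value, time) triples.

    PA[F,T] contributes "F holds-at T" and NA[F,T] contributes "not F holds-at T".
    Generation and persistence arguments do not contribute to A.
    """
    A = set()
    for arg in S:
        if arg.kind == "PA":
            A.add((arg.fluent, True, arg.time))
        elif arg.kind == "NA":
            A.add((arg.fluent, False, arg.time))
    return A


# An argument set shaped like the one in the light example of the text,
# with the abduced event placed (arbitrarily) at time 5.
S2 = [
    Argument("PG", "Light", 4, 2),
    Argument("PA", "Normal", 2),
    Argument("NA", "Light", 2),
    Argument("NG", "Light", 4, 5),
]
A = assumptions(S2)
```

Only the two assumption arguments survive the projection, which matches the shape of the conditional diagnosis computed in the text.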

To illustrate this computation of conditional diagnoses let us consider again
example 1 where Dh7 is absent and Break is an open action type. Suppose we
are given the observations: Light holds-at 4 and ¬Light holds-at 6.

To assimilate the first observation we can use a generation argument, S_0 =
{PG[Light,4;2], PA[Normal,2]}, based on the given event of SwitchOn. This
can defend itself against all its attacks except possibly an attack via a generation
argument based on an event of Break at a time after 2 and before 4. As we have
no such event in our computed diagnosis this remains suspended. Also, because
the p-proposition in D requires that ¬Light holds at 2, S_0 will be extended
to S_1 with NA[Light,2]. To assimilate the second observation the only way we
can extend S_1 is via a generation argument based on an event of Break at a
time before 6. Hence S_1 is extended to S_2 = S_1 ∪ {NG[Light,4;T]} and H_0
to H_1 = {Break happens-at T} for T < 6. Note that this generation of ¬Light
is indirect through the ramification statement in the theory.

Adding  this  new  event  results  in  the  re-examination  of  the  suspended  attack 
from  before.  In  general,  there  are  two  ways  to  deal  with  this  situation.  One  way 
is  to  constrain  the  time  of  the  new  event  so  that  it  does  not  lead  to  an  actual 
attack.  The  other  is  to  counter-attack  this  attack  in  the  usual  way.  In  this  case, 
the second option is not available as we cannot assume SwitchOn events. Hence
we are forced to set T ≥ 4. The computation then concludes successfully with
<H_1,S_2> giving the (set of) conditional diagnoses (one for each T in [4,6))

<{Break happens-at T}, {Normal holds-at 2, ¬Light holds-at 2}>.

A computed conditional diagnosis <H,A> in D can be tested to see if this
is a strong diagnosis by checking whether the assumptions A follow sceptically
from D ∪ H. In the previous example, {Normal holds-at 2} is not a sceptical
consequence (the domain is incomplete on this fluent) and hence the diagnosis
needs this condition.


5  The E-RES System: Implementing E

The  argumentation  based  proof  theory  described  in  section  3  forms  the  basis 
of  a  principled  implementation  of  the  language  E  into  a  system,  called  E-RES. 
The  computational  effectiveness  of  this  system  depends  on  two  main  factors: 
(a) reducing the number of attacks considered for the goal at hand by restricting
attention to attacks that are necessary, and (b) improving the efficiency of the
satisfaction of the global constraints imposed by the t-, p- and r-propositions. A
major  optimization  that  we  can  apply  with  respect  to  the  first  factor  concerns 
the  consideration  of  persistence  attacks. 

Definition 4. A restricted attack against a set S is a minimal attack on S
which does not contain any persistence rule PP[F,T';T] (resp. NP[F,T';T])
unless S contains the assumption rule NA[F,T] (resp. PA[F,T]) and B(D) ∪
S ⊢ HoldsAt(F,T') (resp. B(D) ∪ S ⊢ ¬HoldsAt(F,T')).

Lemma  1.  Let  D  be  a  domain  and  S  a  set  of  argument  rules  that  is  consistent 
and  attacks  all  the  restricted  attacks  against  it.  Then  there  exists  a  superset  of  S 
which  is  admissible. 

This  means  that  we  only  need  to  consider  those  persistence  attacks  against  a 
set S that start from assumptions that are in S. In the implementation of E-RES
we  exploit  this  lemma  by  considering  a  notion  of  suspended  persistence  attacks 
on  an  assumption  which  are  activated  whenever  the  contrary  assumption  (at 
another  time  point)  is  added  to  S. 


5.1  Satisfiability of Constraints in E-RES

The global constraints imposed by the t-, p- and r-propositions can be computationally
demanding. Although most of these constraints refer to a single time
point, those imposed by the ramification statements need to be satisfied at every
time point and hence could be a major source of inefficiency. The lemma below
allows us to address this by confining this task to a specific set of time points.

Lemma 2. Let D be a domain and T_1,T_2 (T_1 < T_2) be time points such that
there is no h-proposition in D at any time T in (T_1,T_2). Suppose also that
there exists a partial model Mp of D defined over the whole time line minus the
interval (T_1,T_2], except at time points in (T_1,T_2] where t-propositions are given
in D where Mp satisfies the conditions imposed by these. Mp also satisfies any
p-propositions at T_2. Then if there exists a time point T in (T_1,T_2] such that Mp
can be extended to a partial model covering also T then Mp can be extended to
a full model of D.
Hence when D contains only a finite number of h-propositions we can split
the (linear) time line into a finite number of time intervals and satisfy the ramification
constraints only at one time point in each of these intervals. E-RES
implements an interleaved process of (a) satisfiability of the ramification statements
as classical implications at these time points and (b) cross-check of the
assumptions required in (a) under the language E default persistence.
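
The splitting of the time line can be pictured with a small sketch. It assumes integer time points and picks, as one arbitrary choice among those the lemma permits, the right end point T_2 of each interval (T_1,T_2]; the function name and the handling of the unbounded last stretch are assumptions of this illustration, not a description of E-RES.

```python
def representative_points(h_times):
    """Pick one time point per interval induced by the h-proposition times.

    For consecutive event times T1 < T2 the interval (T1, T2] is represented
    by T2. The stretch up to the first event is represented by that event's
    time, and the stretch after the last event by the next time point.
    """
    times = sorted(set(h_times))
    if not times:
        return [0]  # no events at all: a single point suffices (0 is arbitrary)
    return times + [times[-1] + 1]


# Events at times 2 and 3, as in the vaccination examples.
points = representative_points([3, 2, 3])
```

With events at 2 and 3, the ramification constraints would then be checked at three points only, rather than at every time point.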

As the number of (ground) ramification constraints at each time point can
be large we can employ a SAT solver [6] within the E-RES system to carry out
this process (a) of generating a classical model for these. Furthermore, we have
considered a notion of relevancy of ramifications to the query at hand which,
assuming that D is consistent, selects at each time point only a subset of ramification
constraints. Therefore we now have an iterative (over the finite number
of time points) process of interleaving between: (i) projecting the assumptions
that we have added to S so far to the current time point and selecting the relevant
ramification constraints based on these, (ii) generating a classical model
of these constraints using a SAT solver given the partial instantiation generated
in (i), and (iii) ensuring the compatibility of this model with the arguments S
computed so far at the previous time points. Note that the output of steps (ii)
and (iii) could affect the set of relevant ramifications computed in (i) and hence
we need to repeat the whole process before going to the next time point.

Initial  experiments  with  this  iterative  method  indicate  a  significant  reduction 
in  the  computation.  Note  however  that  we  are  still  left  with  the  problem  of 
deciding which t-propositions are relevant to the query/goal at hand. Currently,
we  assume  that  these  are  selected  externally  to  the  system. 

The E-RES system is currently implemented in Prolog (Eclipse 4.2). An interface
allows the user to define directly in the syntax of the language E the domain
description. The system also supports some extra forms of auxiliary information,
e.g. that a fluent is constant and so does not change over time. Open action types
are specified together with their associated p-propositions and priority information
amongst them that might exist. In addition, although E is defined as a propositional
language the E-RES system allows domain descriptions to be given in a
non-propositional form under some restrictions. An early version of the system
with  examples  is  available  from  http://www.ucl.ac.uk/  uczcrsm/LanguageE/. 
New  versions  of  the  system  will  be  added  to  this  web  site  in  the  near  future. 


6  Related  Work  and  Conclusions 

Recently there has been a wide interest in developing specialized action languages
[5]. These efforts have concentrated on the formal semantics of such languages
and how they can be applied to specific problems. Examples of these are
the language Golog [12] or the Fluent Calculus as developed in [15] for cognitive
robotics, a circumscriptive Event Calculus [14] and the language C [8] for planning,
and the language L together with others related to it [1,4] for the problem
of diagnosis. Our work focuses on the general computational aspects of such
languages using argumentation and abduction as a basis for a principled implementation
of the language E. A system, called Causal Calculator [13], which is
based on the language C translates the whole representation into a propositional
theory and then uses a SAT solver to find a solution to its query. A systematic
comparison of the E-RES system with these systems would be useful.

An interesting feature of our approach is the possibility it opens of synthesizing,
in the implementation of these languages, the resolution based computation
of argumentation and abduction in Logic Programming with the propositional
satisfiability methods of SAT solvers. A SAT-based procedure has also been used
recently in C [7] for planning. This hybrid computational model, which could also
include other methods, e.g. constraint solving, is an important topic of future
work. Currently, the system is designed with emphasis on the complexity of
reasoning that it can perform rather than on the efficiency of large scale computation.
We are studying ways to improve this by investigating further notions
of relevancy in order to dynamically focus the computation only on the parts of
the theory, especially of t-propositions, that are needed for the query at hand.
The set V(t) of variables that occur in a term t is defined as follows:

$$V(t) = \begin{cases} \emptyset, & \text{if } t \text{ is a constant;} \\ \{t\}, & \text{if } t \text{ is a variable; and} \\ \bigcup_{i=1}^{m} V(t_i), & \text{if } t \text{ is a function } f(t_1,\ldots,t_m). \end{cases} \tag{3}$$

A  variable  occurs  in  a  literal  if  it  occurs  in  at  least  one  of  its  arguments: 

$$V(a(t_1,\ldots,t_n)) = \bigcup_{i=1}^{n} V(t_i). \tag{4}$$

An inference rule R is of the form:

$$h \leftarrow l_1, \ldots, l_n \tag{5}$$

where the head h is an atom and l_1, ..., l_n in the body are literals. The sets
of positive and negative literals in the body of R are denoted by body^+(R) and
body^-(R) respectively. Intuitively, a rule asserts that if all literals in the rule
body are true, then the head must be true also. A logic program P is a finite set
of rules. We denote the set of predicate symbols that occur in P with preds(P).

The set of variables that occur in a rule R is defined in terms of variables
that occur in its literals:

$$V(R) = V(head(R)) \cup \bigcup_{l \in body(R)} V(l). \tag{6}$$

A rule is ground if V(R) = ∅.
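
The three definitions of V can be transcribed almost literally. The sketch below is one possible encoding, not taken from the paper: the classes `Var`, `Fn`, `Literal` and `Rule` are hypothetical, and any argument that is neither a `Var` nor an `Fn` is treated as a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fn:
    symbol: str
    args: tuple


@dataclass(frozen=True)
class Literal:
    pred: str
    args: tuple
    negated: bool = False


@dataclass(frozen=True)
class Rule:
    head: Literal
    body: tuple


def term_vars(t):
    """V(t): empty for a constant, {t} for a variable, union over arguments for f(...)."""
    if isinstance(t, Var):
        return {t}
    if isinstance(t, Fn):
        result = set()
        for arg in t.args:
            result |= term_vars(arg)
        return result
    return set()  # anything else is treated as a constant


def literal_vars(lit):
    """V(a(t1,...,tn)): union of V(ti) over the arguments of the literal."""
    result = set()
    for arg in lit.args:
        result |= term_vars(arg)
    return result


def rule_vars(rule):
    """V(R): variables of the head together with those of every body literal."""
    result = literal_vars(rule.head)
    for lit in rule.body:
        result |= literal_vars(lit)
    return result


def is_ground(rule):
    return not rule_vars(rule)


x = Var("x")
# odd(x + 1) <- number(x), even(x)
r = Rule(Literal("odd", (Fn("+", (x, 1)),)),
         (Literal("number", (x,)), Literal("even", (x,))))
# even(0) <-
fact = Rule(Literal("even", (0,)), ())
```

Here `rule_vars(r)` contains only `x`, while the fact has no variables and is therefore ground.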

Let P be a variable-free logic program and M be a set of atoms that occur
in P. Then, the Gelfond-Lifschitz reduct P^M is obtained by:

1. removing each rule with a negative literal not A in its body where A ∈ M.

2. removing all negative literals from the bodies of the remaining rules.

Since P^M is negation-free, it has a unique least model M'. If the model M'
coincides with M, then M is a stable model of P.
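
The two-step construction of the reduct and the stability test can be tried out on a small ground program. The sketch below is a naive illustration under assumptions of our own: rules are encoded as triples (head, positive body, negative body) over atom names written as strings, and stable models are found by brute-force enumeration, which is only feasible for tiny programs.

```python
from itertools import combinations


def rule(head, pos=(), neg=()):
    """A ground rule: head <- pos_1, ..., not neg_1, ... (hypothetical encoding)."""
    return (head, frozenset(pos), frozenset(neg))


def reduct(program, M):
    """Gelfond-Lifschitz reduct: drop rules blocked by M, then drop all 'not' literals."""
    return [(head, pos, frozenset())
            for head, pos, neg in program
            if not (neg & M)]


def least_model(positive_program):
    """Least model of a negation-free ground program by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos, _ in positive_program:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model


def is_stable(program, M):
    M = frozenset(M)
    return least_model(reduct(program, M)) == M


def stable_models(program):
    """Test every subset of the atoms of the program (exponential, illustration only)."""
    atoms = sorted({a for head, pos, neg in program for a in {head} | pos | neg})
    for k in range(len(atoms) + 1):
        for candidate in combinations(atoms, k):
            if is_stable(program, candidate):
                yield set(candidate)


# A small ground program with two independent choices between b(i) and c(i).
P = [
    rule("a(1)"), rule("a(2)"),
    rule("b(1)", ["a(1)"], ["c(1)"]), rule("b(2)", ["a(2)"], ["c(2)"]),
    rule("c(1)", ["a(1)"], ["b(1)"]), rule("c(2)", ["a(2)"], ["b(2)"]),
]
models = list(stable_models(P))
```

Each choice between b(i) and c(i) is independent, so the enumeration yields one stable model per combination of choices.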

The Herbrand universe HU(P) of a logic program P is the set of constant
terms that can be formed using the constants and function symbols of P. A
ground instance of a literal or a rule can be obtained by replacing variables in
it by terms in HU(P). The Herbrand instantiation P_G is the set of all possible
instantiations of rules in P. The set of stable models of a logic program P
with variables is defined to be the set of stable models of its instantiation P_G. In
practice, we usually do not have to construct the full Herbrand instantiation to be
able to construct all stable models. Hereafter we will use the term instantiation
of P to mean any subset of P_G that has the same set of stable models.

Example 1. Let P be the program:

$$\begin{aligned} a(1) &\leftarrow\ ;\quad a(2) \leftarrow \\ b(x) &\leftarrow a(x), \text{not } c(x) \\ c(x) &\leftarrow a(x), \text{not } b(x). \end{aligned} \tag{7}$$
Fig. 1. The dependency graph of Program (1)


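Then, the instantiation P_G is:

$$\begin{aligned} a(1) &\leftarrow & a(2) &\leftarrow \\ b(1) &\leftarrow a(1), \text{not } c(1) & b(2) &\leftarrow a(2), \text{not } c(2) \\ c(1) &\leftarrow a(1), \text{not } b(1) & c(2) &\leftarrow a(2), \text{not } b(2). \end{aligned} \tag{8}$$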
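
For a function-free program such as this one, the instantiation can be produced mechanically by substituting every constant for every variable. The following sketch shows that naive procedure; the tuple encoding of atoms and rules, and the explicit list of variables attached to each rule, are assumptions made for the illustration.

```python
from itertools import product


def atom(pred, *args, negated=False):
    """An atom or negative literal as a (predicate, arguments, negated) triple."""
    return (pred, args, negated)


def substitute(literal, binding):
    """Replace every variable name found in the binding by its constant."""
    pred, args, negated = literal
    return (pred, tuple(binding.get(a, a) for a in args), negated)


def ground(rules, constants):
    """Naive instantiation: try every assignment of constants to each rule's variables."""
    ground_rules = []
    for head, body, variables in rules:
        for values in product(constants, repeat=len(variables)):
            binding = dict(zip(variables, values))
            ground_rules.append((substitute(head, binding),
                                 tuple(substitute(l, binding) for l in body)))
    return ground_rules


# Each rule is (head, body, variables); variables are written as strings.
P = [
    (atom("a", 1), (), ()),
    (atom("a", 2), (), ()),
    (atom("b", "x"), (atom("a", "x"), atom("c", "x", negated=True)), ("x",)),
    (atom("c", "x"), (atom("a", "x"), atom("b", "x", negated=True)), ("x",)),
]
P_G = ground(P, constants=[1, 2])
```

The two facts are kept as they are and each of the two rules with the variable x yields one instance per constant, giving six ground rules in total.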

Now, P_G has four stable models: M_1 = {a(1), a(2), b(1), b(2)}, M_2 = {a(1), a(2),
b(1), c(2)}, M_3 = {a(1), a(2), c(1), b(2)}, and M_4 = {a(1), a(2), c(1), c(2)}. Consider
M_1. The reduct P_G^{M_1} is the program:

$$b(1) \leftarrow a(1);\quad b(2) \leftarrow a(2);\quad a(1) \leftarrow;\quad a(2) \leftarrow. \tag{9}$$

The least model of P_G^{M_1} is {a(1), a(2), b(1), b(2)} = M_1, so we see that M_1 is
really a stable model of P_G and hence of P.

A splitting set of a normal logic program P is a set of ground atoms U such
that for all rules R in P_G, if head(R) is in U, then all atoms occurring in the
rule body are also in U. We denote the set of ground rules whose heads are in U
with b_U(P_G). Given an evaluation I of atoms in U, we denote by e_U(P_G, I)
the set of ground rules that is obtained by removing from P_G all rules that have
a literal l containing an atom of U that is not satisfied by I and removing all
literals l containing a member of U from the bodies of the remaining rules. By the
Splitting Set Theorem [8], M is a stable model of P_G only if M = I ∪ J where I
is a stable model of b_U(P_G) and J is a stable model of e_U(P_G \ b_U(P_G), I).
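
The defining condition of a splitting set is easy to check on a ground program. The sketch below does so and also extracts the bottom part b_U; the triple encoding of rules and the helper names are assumptions of this illustration.

```python
def rule(head, pos=(), neg=()):
    """A ground rule: head <- pos_1, ..., not neg_1, ... (hypothetical encoding)."""
    return (head, frozenset(pos), frozenset(neg))


def is_splitting_set(program, U):
    """U splits the program if every rule with its head in U has all body atoms in U."""
    U = set(U)
    return all((pos | neg) <= U for head, pos, neg in program if head in U)


def bottom(program, U):
    """b_U: the ground rules whose heads are in U."""
    U = set(U)
    return [r for r in program if r[0] in U]


P_G = [
    rule("a(1)"), rule("a(2)"),
    rule("b(1)", ["a(1)"], ["c(1)"]), rule("b(2)", ["a(2)"], ["c(2)"]),
    rule("c(1)", ["a(1)"], ["b(1)"]), rule("c(2)", ["a(2)"], ["b(2)"]),
]
U = {"a(1)", "a(2)"}
```

For this program, `is_splitting_set(P_G, U)` holds because the facts for a have empty bodies, whereas {"b(1)"} alone is not a splitting set since the rule for b(1) mentions a(1) and c(1).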

3  Omega-Restricted  Logic  Programs 

In this section we give a formal definition for ω-restricted programs. The main
idea is to construct a stratification of the predicate symbols such that a predicate
p is on a higher level than a predicate q if p is defined in terms of q. We
start by formalizing the concept of dependency between predicate symbols.

Definition 1. Let P be a logic program. Then, the one-step dependency relation
D_1(P) ⊆ preds(P) × preds(P) is defined as follows:
Abstract.  We define a new syntactic class of logic programs, omega-restricted
programs. We divide the predicate symbols of a logic program
into  two  parts:  domain  and  non-domain  predicates,  where  the  domain 
predicates  are  defined  by  the  maximal  stratifiable  subset  of  the  rules  of 
the  program.  We  extend  the  usual  definition  of  stratification  by  adding 
a  special  omega-stratum  that  holds  all  unstratifiable  predicates  of  the 
program.  We  demand  that  all  variables  that  occur  in  a  rule  also  occur 
in  the  rule  body  in  a  positive  literal  that  is  on  a  lower  stratum  than  rule 
head.  This  restriction  is  syntactic  and  can  be  checked  efficiently.  The 
existence  of  a  stable  model  of  an  omega-restricted  program  is  decidable 
even when function symbols are allowed. We prove that the problem is
2-NEXP-complete and identify subclasses of omega-restricted programs
such  that  the  problem  stays  in  NEXP  or  NP.  The  class  of  omega- 
restricted  programs  is  implemented  in  the  Smodels  system. 


1  Introduction 

The answer set programming (ASP) paradigm has gained popularity in
recent years as a number of ASP systems have become available (for example,
DeReS [3], dlv [6], and Smodels [12]). The basic idea of ASP is to encode a
problem as a logic program such that the answer sets (stable models) of the program
correspond to the solutions of the problem. We then use a logic program
engine  to  find  the  answer  sets  of  the  program.  The  underlying  formal  semantics 
is  usually  based  on  some  extension  of  the  stable  model  semantics  of  normal  logic 
programs  [7]. 

The  inference  engines  of  the  existing  systems  work  with  ground  programs, 
that  is,  programs  without  variables.  A  rule  with  variables  represents  the  set  of 
ground  rules  that  can  be  created  by  replacing  the  variables  in  it  by  constant 
terms  that  occur  in  the  program.  This  instantiation  is  done  in  a  preprocessing 
step  before  the  actual  inference  engine  is  used.  This  bottom-up  approach  to 
variable  use  has  prevented  the  use  of  function  symbols  since  even  one  function 
symbol  in  a  program  forces  its  Herbrand  instantiation  to  be  infinite.  However, 
in most cases it is enough to examine only a small subset of the Herbrand
instantiation since the vast majority of the rules will have unsatisfiable bodies so
they can be left out without affecting the set of stable models. This holds true


T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  267—279,  2001. 
(c)  Springer- Verlag  Berlin  Heidelberg  2001 


268  Tommi  Syrjanen 


even  when  function  symbols  are  allowed;  it  is  possible  that  all  answer  sets  of  a 
program  are  finite  and  computable  even  if  the  Herbrand  instantiation  is  infinite. 

The aim of this work is to define a new class of logic programs, ω-restricted
programs, that are syntactically guaranteed to be decidable even when function
symbols are used. The basic idea is to construct a hierarchy of predicates such
that the predicates on the lowest level are defined using only ground facts and
all variables that occur in a rule of level n + 1 have to also occur in a positive
literal of level n or lower in the rule body. The definition of the hierarchy extends
the usual concept of stratification [1] by adding a special ω-stratum to hold the
unstratifiable part of a program.

It turns out that this syntactic restriction is strong enough to guarantee finite
answer sets. In fact, we will see that deciding whether an ω-restricted program
has an answer set is 2-NEXP-complete. Since the stable model semantics of
logic programs without functions is NEXP-complete [4], we can conclude that
by using ω-restricted functions we move up one step in the exponential hierarchy.
This result also implies that we cannot solve all computable problems using
ω-restricted programs. Recently P. Bonatti [2] has proposed a computationally
complete class of logic programs called finitary programs. However, together
with Turing equivalence comes semi-decidability of general reasoning problems.

The ω-restricted programs have been implemented in the Smodels system [12],
which was designed at Helsinki University of Technology. The Smodels system
is available at http://www.tcs.hut.fi/Software/smodels.

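In the following sections we will use the following program to illustrate the
basic concepts of ω-restriction:

\[
\begin{aligned}
\mathit{number}(0) &\leftarrow \; ; \; \ldots \; ; \; \mathit{number}(n) \leftarrow \\
\mathit{odd}(x + 1) &\leftarrow \mathit{number}(x),\ \mathit{even}(x) \\
\mathit{even}(x + 1) &\leftarrow \mathit{number}(x),\ \mathit{odd}(x) \\
\mathit{even}(0) &\leftarrow \\
\mathit{two\_divides}(x) &\leftarrow \mathit{even}(x) \\
\mathit{interesting}(x) &\leftarrow \mathit{number}(x),\ \text{not } \mathit{dull}(x) \\
\mathit{dull}(x) &\leftarrow \mathit{number}(x),\ \text{not } \mathit{interesting}(x) \\
\mathit{interesting\_odd}(x) &\leftarrow \mathit{odd}(x),\ \mathit{interesting}(x) .
\end{aligned}
\tag{1}
\]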

2  The  Stable  Model  Semantics 

The basic component of a logic program is an atom of the form:

\[ p(t_1, \ldots, t_n) \tag{2} \]

where $p$ is an n-ary predicate symbol ($n \ge 0$) and $t_1, \ldots, t_n$ are terms. A term
is either a variable $v$, a constant $c$, or an m-ary function symbol $f(t_1, \ldots, t_m)$
where $t_1, \ldots, t_m$ are terms. We denote the predicate symbol of an atom $A$ by
$pred(A)$. A literal is either an atom $A$ or its negation not $A$.
1. $D_1^+(P) = \{\langle pred(a), pred(l) \rangle \mid \exists R \in P : a = head(R) \wedge l \in body^+(R)\}$

2. $D_1^-(P) = \{\langle pred(a), pred(l) \rangle \mid \exists R \in P : a = head(R) \wedge l \in body^-(R)\}$

3. $D_1(P) = D_1^+(P) \cup D_1^-(P)$.

The  one-step  dependency  relation  may  be  drawn  as  a  graph.  For  example,  the 
dependency  graph  of  Program  (1)  is  shown  in  Figure  1. 

We  now  generalize  the  one-step  dependency  relation  to  a  full  dependency 
relation.  The  intuition  is  that  a  predicate  p  depends  on  a  predicate  q  if  there  is  a 
path from p to q in the dependency graph. If at least one of the edges between p
and  q  is  negative,  then  p  depends  negatively  on  q. 

Definition 2. A dependency path $\pi_P$ of a logic program P is a sequence

\[ \pi_P = \langle p_1, p_2, \ldots, p_n \rangle \tag{10} \]

where $p_i \in preds(P)$ for $1 \le i \le n$ and $\langle p_j, p_{j+1} \rangle \in D_1(P)$ for $1 \le j < n$. A
path $\pi_P$ is negative (denoted by $\pi_P^-$) if and only if $\langle p_j, p_{j+1} \rangle \in D_1^-(P)$ for some
$1 \le j < n$.

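Definition 3. The dependency relation $D(P) \subseteq preds(P) \times preds(P)$ of a logic
program P is defined as follows:

1. $D^+(P) = \{\langle p, q \rangle \mid \exists \pi_P : \pi_P = \langle p, \ldots, q \rangle\}$;

2. $D^-(P) = \{\langle p, q \rangle \mid \exists \pi_P^- : \pi_P^- = \langle p, \ldots, q \rangle\}$; and

3. $D(P) = D^+(P) \cup D^-(P)$.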

Next, we define the concept of ω-stratification. The definition extends the
traditional definition of stratification [1] by adding a new stratum, the ω-stratum,
for the predicates that depend negatively on each other.

Definition 4. An ω-stratification of a program P is a function
$S : preds(P) \rightarrow \mathbb{N} \cup \{\omega\}$ such that:

1. $\forall p_1 \forall p_2 (\langle p_1, p_2 \rangle \in D^+(P) \rightarrow S(p_1) \ge S(p_2))$; and

2. $\forall p_1 \forall p_2 (\langle p_1, p_2 \rangle \in D^-(P) \rightarrow S(p_1) > S(p_2) \vee S(p_1) = \omega)$.

We use the convention that $\omega > n$ for all $n \in \mathbb{N}$. The first condition asserts that
a predicate $p_1$ that depends positively on a predicate $p_2$ has to be on at least as
high a stratum as $p_2$. The second condition states that if $p_1$ depends negatively
on $p_2$, then $p_1$ has to be on a higher stratum or they both must be in the
ω-stratum. In practice, we are interested in stratifications that are strict in the
sense that $S(p_1) > S(p_2)$ whenever $p_1$ depends on $p_2$ but not vice versa.

Example 2. Consider Program (1). We can construct an ω-stratification S for it
by looking at its dependency graph. As there are no edges leading from number,
we can set S(number) = 0. Predicates even and odd depend on number and each
other, so we set S(even) = S(odd) = 1. Continuing this, two_divides depends
on even so S(two_divides) = 2. The negative cycle of interesting and dull forces
that S(interesting) = S(dull) = S(interesting_odd) = ω. This stratification is
shown in Figure 2.
Fig. 2. A stratification of Program (1)


Next, we will extend the ω-stratification to cover also rules and variables by
defining the concept of an ω-valuation.

Definition 5. The ω-valuation of a rule R under an ω-stratification S is the
function:

\[ \Omega(R, S) = S(pred(head(R))) \]

The ω-valuation of a variable v in a rule R under an ω-stratification S is the
function:

\[ \Omega(v, R, S) = \min(\{S(pred(a)) \mid a \in body^+(R) \wedge v \in V(a)\} \cup \{\omega\}) \]

Example 3. Let S be as defined in Example 2. Consider the rule

\[ R : \mathit{interesting}(x) \leftarrow \mathit{number}(x),\ \text{not } \mathit{dull}(x) . \]

Only number(x) occurs positively in the body, so

\[ \Omega(R, S) = S(\mathit{interesting}) = \omega \]
\[ \Omega(x, R, S) = \min\{S(\mathit{number}), \omega\} = \min\{0, \omega\} = 0 . \]

A rule is ω-restricted if all variables that occur in it also occur in a positive
body literal that belongs to a strictly lower stratum than the head.

Definition 6. Let R be a rule in a logic program P. Then R is ω-restricted if
and only if there exists an ω-stratification S such that:

\[ \forall v \in V(R) : \Omega(v, R, S) < \Omega(R, S) . \]

Definition 7. Let P be a logic program. Then P is ω-restricted if and only if
all rules $R \in P$ are ω-restricted.
Example 4. Consider the rule:

\[ s(x + 1) \leftarrow s(x) \]

This rule is not ω-restricted since for all ω-stratifications, $\Omega(R, S) = \Omega(x, R, S)$.
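
Definitions 5 and 6 can be checked mechanically once a stratification S is fixed. The following Python sketch is only an illustration of that check; the rule representation (a dictionary holding a head atom and lists of positive and negative body atoms, each atom being a predicate name paired with the variables it contains) and all function names are assumptions made here, not part of the Smodels implementation. Note that it tests a given S, whereas Definition 6 only asks that some suitable S exists.

```python
OMEGA = float("inf")  # stands in for the ω-stratum; every finite stratum is smaller

def rule_valuation(rule, S):
    """Ω(R, S): the stratum of the head predicate."""
    head_pred, _head_vars = rule["head"]
    return S[head_pred]

def var_valuation(v, rule, S):
    """Ω(v, R, S): lowest stratum of a positive body atom containing v, or ω."""
    strata = [S[pred] for pred, variables in rule["pos"] if v in variables]
    return min(strata + [OMEGA])

def rule_vars(rule):
    """All variables occurring anywhere in the rule."""
    atoms = [rule["head"]] + rule["pos"] + rule["neg"]
    return {v for _pred, variables in atoms for v in variables}

def is_omega_restricted(rule, S):
    """Definition 6 for one fixed stratification S."""
    return all(var_valuation(v, rule, S) < rule_valuation(rule, S)
               for v in rule_vars(rule))

# Example 3: interesting(x) <- number(x), not dull(x)
S = {"number": 0, "interesting": OMEGA, "dull": OMEGA}
R3 = {"head": ("interesting", ["x"]),
      "pos": [("number", ["x"])],
      "neg": [("dull", ["x"])]}

# Example 4: s(x + 1) <- s(x)
R4 = {"head": ("s", ["x"]), "pos": [("s", ["x"])], "neg": []}

print(is_omega_restricted(R3, S))            # the rule of Example 3
print(is_omega_restricted(R4, {"s": 3}))     # Example 4 on a finite stratum
print(is_omega_restricted(R4, {"s": OMEGA})) # Example 4 on the ω-stratum
```

For the rule of Example 4 the check fails whichever stratum s is given, because the only positive body atom containing x has the same predicate as the head.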

Finally, we divide the predicate symbols into two classes, domain predicates
that are on finite strata and non-domain predicates that are on the ω-stratum.
The domain predicates are defined by the maximal stratifiable subset of the rules
of the program.

Definition 8. Let P be an ω-restricted program. Then a predicate $p \in preds(P)$
is a domain predicate if and only if there exists an ω-stratification S such that
$S(p) < \omega$. The set of rules defining domain predicates of P is denoted by $\mathcal{D}(P)$.

4  Domain  Predicates  and  Instantiation 

The subprogram $\mathcal{D}(P)$ defining the domain predicates of P is stratified, so it
has a unique least model $M_{\mathcal{D}(P)}$. It is easy to verify that $M_{\mathcal{D}(P)}$ is a splitting set
of P. Thus, we can compute the stable models of P by first computing $M_{\mathcal{D}(P)}$
and then extending it to cover the atoms on the ω-stratum. Moreover, each variable
that occurs in a rule R occurs also in a positive domain literal so we can create
all relevant ground instances of R by computing the natural join of extensions
of the domain literals in body(R).
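
The natural join mentioned above can be sketched in a few lines of Python. This is a hedged illustration rather than the Smodels grounder: the function names, the convention that variables are strings while constants are non-string values, and the dictionary of fact tuples are all assumptions made for this example, and every fact is assumed to have the same arity as the literal it is matched against.

```python
def _unify(arg, value, binding):
    """Match one argument of a literal against one component of a fact."""
    if isinstance(arg, str):                       # a variable (assumed convention)
        return binding.setdefault(arg, value) == value
    return arg == value                            # a constant must match exactly

def join_bindings(domain_literals, facts):
    """Join the extensions of the positive domain literals of a rule body.

    domain_literals: list of (predicate, argument tuple)
    facts:           dict mapping a predicate to a set of ground tuples
    Returns the list of variable bindings that satisfy every literal.
    """
    bindings = [{}]
    for pred, args in domain_literals:
        extended = []
        for binding in bindings:
            for fact in facts.get(pred, ()):
                candidate = dict(binding)          # copy, so failures leave no trace
                if all(_unify(a, value, candidate) for a, value in zip(args, fact)):
                    extended.append(candidate)
        bindings = extended
    return bindings

# Domain facts as they might look after evaluating a lower stratum.
facts = {"number": {(0,), (1,), (2,), (3,)}, "even": {(0,), (2,)}}

# Body of  odd(x + 1) <- number(x), even(x)
body = [("number", ("x",)), ("even", ("x",))]
for b in sorted(join_bindings(body, facts), key=lambda b: b["x"]):
    print("odd(%d) <- number(%d), even(%d)" % (b["x"] + 1, b["x"], b["x"]))
```

Each binding returned by the join yields exactly one ground instance of the rule, so no instance with an unsatisfiable domain literal is ever produced.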

We can compute $M_{\mathcal{D}(P)}$ and the instantiation $P_{NG}$ of $P_N = P \setminus \mathcal{D}(P)$ using
the following algorithm:

1.  Find  all  strongly  connected  components  of  the  dependency  graph  of  P.  Each 
component  becomes  a  new  stratum  with  the  exception  that  all  components 
that have a path to a negative dependency cycle are put on the ω-stratum.
Order  the  different  strata  by  doing  a  depth-first  search  over  the  strongly 
connected  components. 

2.  Instantiate  the  predicates  on  finite  strata  starting  from  the  lowest  one.  After 
instantiation,  compute  the  deductive  closure  of  the  new  ground  rules  and 
store  the  resulting  atoms  as  facts  in  a  database.  These  facts  are  then  used 
to  give  domains  for  variables  when  we  instantiate  the  rules  on  higher  strata. 

3. Finally, instantiate all rules on the ω-stratum and output them along with
the  domain  facts. 
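
Step 1 of the algorithm can be sketched as follows. The Python code below is an illustration under stated assumptions, not the Smodels code: rules are reduced to triples of head predicate, positive body predicates, and negative body predicates; Tarjan's algorithm is used for the strongly connected components (the text only says that a depth-first search orders them); and a finite stratum is taken to be one more than the highest stratum a component depends on. The name `omega_stratification` is hypothetical.

```python
from itertools import count

OMEGA = float("inf")  # stands in for the ω-stratum

def omega_stratification(rules):
    """rules: iterable of (head_pred, positive_body_preds, negative_body_preds)."""
    edges, preds = {}, set()   # edges[p][q] is True iff p depends negatively on q
    for head, pos, neg in rules:
        preds.update([head, *pos, *neg])
        for q in pos:
            edges.setdefault(head, {}).setdefault(q, False)
        for q in neg:
            edges.setdefault(head, {})[q] = True

    index, low, comp_of = {}, {}, {}
    stack, comps, counter = [], [], count()

    def visit(p):              # Tarjan's strongly connected components
        index[p] = low[p] = next(counter)
        stack.append(p)
        for q in edges.get(p, {}):
            if q not in index:
                visit(q)
                low[p] = min(low[p], low[q])
            elif q not in comp_of:          # q is still on the stack
                low[p] = min(low[p], index[q])
        if low[p] == index[p]:              # p is the root of a component
            members = []
            while not members or members[-1] != p:
                members.append(stack.pop())
            for q in members:
                comp_of[q] = len(comps)
            comps.append(members)

    for p in sorted(preds):
        if p not in index:
            visit(p)

    # Components are produced dependencies-first, so every predicate that a
    # component depends on already has a stratum when the component is reached.
    stratum = {}
    for cid, members in enumerate(comps):
        level, is_omega = 0, False
        for p in members:
            for q, negative in edges.get(p, {}).items():
                if comp_of[q] == cid:
                    is_omega = is_omega or negative   # negative edge inside a cycle
                elif stratum[q] == OMEGA:
                    is_omega = True                   # path to a negative cycle
                else:
                    level = max(level, stratum[q] + 1)
        for p in members:
            stratum[p] = OMEGA if is_omega else level
    return stratum

# Predicate-level view of Program (1).
PROGRAM_1 = [
    ("number", [], []),
    ("odd", ["number", "even"], []),
    ("even", ["number", "odd"], []),
    ("even", [], []),
    ("two_divides", ["even"], []),
    ("interesting", ["number"], ["dull"]),
    ("dull", ["number"], ["interesting"]),
    ("interesting_odd", ["odd", "interesting"], []),
]
print(omega_stratification(PROGRAM_1))
```

On Program (1) this places number on stratum 0, even and odd on stratum 1, two_divides on stratum 2, and the three remaining predicates on the ω-stratum.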

5  Computational  Complexity 

In this section we examine the computational complexity of ω-restricted
programs. We are interested in two problems:

-  In INSTANTIATION we have an ω-restricted program P and a ground atom
$p(t_1, \ldots, t_n)$ and we want to find whether one of the following conditions
holds:

1. $p(t_1, \ldots, t_n) \in M_{\mathcal{D}(P)}$; or

2. There is a rule $p(t_1, \ldots, t_n) \leftarrow l_1, \ldots, l_m$ in $P_{NG}$.

-  In MODEL we want to find out whether an ω-restricted program P has an
answer set.

Table 1. Computational complexity

                                        INSTANTIATION      MODEL
No variables                            -                  NP-complete
Fixed variables       No functions      P-complete         NP-complete
                      With functions    EXP-complete       NEXP-complete
Unlimited variables   No functions      EXP-complete       NEXP-complete
                      With functions    2-EXP-complete     2-NEXP-complete

The  instantiation  complexity  is  included  in  the  model  complexity  in  all  cases 
since  we  may  have  to  construct  the  full  instantiation  of  a  program  before  we 
know  whether  it  has  any  stable  models  at  all. 

In addition to proving complexity results for the whole class of ω-restricted
programs, we examine how the computational complexities of INSTANTIATION
and MODEL change when we restrict our attention to some subclasses of
programs. We use two parameters to divide the ω-restricted programs into four
classes:

-  The  maximum  number  of  variables  in  a  rule  is  either  fixed  to  some  constant  d 
or  it  is  unlimited;  and 

-  Function  symbols  are  either  allowed  or  not. 

The main complexity results are presented in Table 1. The model complexities
of function-free normal logic programs with the stable model semantics have
been presented in earlier literature [10,4]. The corresponding complexity classes
of function-free ω-restricted programs are the same so we see that at least in
these categories ω-restricted programs are as expressive as normal logic
programs. Since the MODEL problem of the unrestricted case is 2-NEXP-complete,
we know that ω-restricted programs are decidable:

Theorem 1. Both INSTANTIATION and MODEL are decidable for ω-restricted
programs.


5.1  Turing  Machine  Translation 

Most  of  the  complexity  results  of  this  work  are  derived  by  proving  that  the 
computations  of  a  deterministic  Turing  machine  M  can  be  simulated  by  a  logic 
program  P  such  that  the  size  of  P  is  polynomial  with  respect  to  the  size  of  M. 

Definition 9. A deterministic Turing machine is a tuple $M = \langle K, \Sigma, \delta, s \rangle$ where K is a
finite set of states, $\Sigma$ is a finite alphabet containing the blank symbol $\sqcup$, $s \in K$
is the initial state and $\delta$ is a transition function
$\delta : K \times \Sigma \rightarrow (K \cup \{y, n\}) \times \Sigma \times \{-1, 0, 1\}$.
A computation of a Turing machine M given an input x starts from the
configuration $(s, \sqcup x)$ and each computation step yields a new configuration
according to $\delta$ until one of the halting states y (accept) or n (reject) is reached.

We encode the states of a Turing machine M using the predicate state(q), the
alphabet using symbol(σ), and the transitions using transition($q_1, \sigma_1, q_2, \sigma_2, d$),
where $d \in \{-1, 0, +1\}$. The atom at_place(σ, p, t) is used to denote that the
input tape cell p contains the symbol σ at the time step t. The predicate
current_state(q, p, σ, t) indicates that the machine is in the state q and the head
is over the tape cell p looking at the symbol σ at the time step t.

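We encode one computation step using the two rules:

\[
\begin{aligned}
\mathit{at\_place}(s_2, p, t + 1) \leftarrow{} & \mathit{transition}(q_1, s_1, q_2, s_2, d), \\
& \mathit{current\_state}(q_1, p, s_1, t), \\
& \mathit{place}(p),\ \mathit{time}(t)
\end{aligned}
\tag{11}
\]

\[
\begin{aligned}
\mathit{current\_state}(q_2, p + d, s_3, t + 1) \leftarrow{} & \mathit{transition}(q_1, s_1, q_2, s_2, d), \\
& \mathit{current\_state}(q_1, p, s_1, t), \\
& \mathit{at\_place}(s_3, p + d, t),\ \mathit{time}(t), \\
& \mathit{place}(p),\ \mathit{symbol}(s_3) .
\end{aligned}
\tag{12}
\]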

Here  we  have  used  the  notation  t  +  1  to  denote  the  successor  of  t.  How  the 
successor  relation  is  actually  defined  depends  on  the  program  class  that  we  want 
to  examine.  The  same  thing  holds  also  for  predecessor  relation  that  is  used  in 
the  case  of  p  -  1. 

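The rules above handle the cell where the read/write-head is currently
positioned. In addition, we have to assert that the state of the other tape cells stays
constant:

\[
\begin{aligned}
\mathit{at\_place}(s_1, p_1, t + 1) \leftarrow{} & \mathit{current\_state}(q, p_2, s_2, t), \\
& \mathit{at\_place}(s_1, p_1, t),\ \mathit{time}(t), \\
& \mathit{symbol}(s_1),\ \mathit{symbol}(s_2),\ \mathit{state}(q), \\
& \mathit{place}(p_1),\ \mathit{place}(p_2),\ \text{not } \mathit{equal}(p_1, p_2) .
\end{aligned}
\tag{13}
\]

In the initial configuration all tape cells that are not part of the input are empty:

\[ \mathit{at\_place}(\sqcup, p, 1) \leftarrow \mathit{place}(p),\ \text{not } \mathit{part\_of\_input}(p) . \tag{14} \]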

The first $|x|$ tape cells are initialized from the input and they also belong to the
extension of part_of_input/1. Finally, we want to recognize whether the Turing
machine halts in an accepting state or not:

\[
\begin{aligned}
\mathit{accept} &\leftarrow \mathit{current\_state}(y, p, s, t),\ \mathit{place}(p),\ \mathit{symbol}(s),\ \mathit{time}(t) \\
\mathit{reject} &\leftarrow \mathit{current\_state}(n, p, s, t),\ \mathit{place}(p),\ \mathit{symbol}(s),\ \mathit{time}(t) .
\end{aligned}
\tag{15}
\]


Note that we have not yet given definitions for the predicates time/1 and
place/1 that encode the time steps and tape cells. In the following complexity
proofs we show how we can define them in a polynomial number of rules using
tools that are available for the four different ω-restricted program classes.

Since all rules except (13) and (14) in the translation are negation-free and
both negations are over a predicate with a fixed extension that is linear in the
size of the input program, the least model of the instantiation can be found in
linear time with respect to its size [5]. All predicates are domain predicates and
we can easily find whether accept or reject is true in $M_{\mathcal{D}(P)}$.

We  can  generalize  the  translation  to  allow  non-deterministic  Turing  machines 
by  forcing  the  machine  to  choose  between  possible  transitions  at  all  computation 
steps.  Due  to  space  constraints,  we  do  not  include  the  details  here  but  one 
possible  translation  has  been  presented  by  V.  W.  Marek  and  J.  B.  Remmel  [9]. 
The  existence  of  such  a  translation  is  enough  to  prove  the  following  lemma: 

Lemma 1. If the INSTANTIATION problem of a subclass of the ω-restricted
programs is C-complete for some complexity class C, the corresponding MODEL
problem is NC-complete.

5.2  Complexity  Results 

Theorem 2. The INSTANTIATION of an ω-restricted program is P-complete
when the number d of variables occurring in it is fixed ($d \ge 3$) and no
function symbols are allowed.

Proof. We construct the proof in two parts:

(a) Inclusion. Let P be a program with d distinct variables. Then, each rule
has at most $n^d$ ground instances, where n is the number of constants in the
program.

(b) Hardness. The P-complete problem Boolean circuit value [11] can be
expressed as an ω-restricted logic program as follows:

\[
\begin{aligned}
\mathit{true}(G) &\leftarrow \mathit{nand\_gate}(G, L, R),\ \mathit{false}(L) \\
\mathit{true}(G) &\leftarrow \mathit{nand\_gate}(G, L, R),\ \mathit{false}(R) \\
\mathit{false}(G) &\leftarrow \mathit{nand\_gate}(G, L, R),\ \mathit{true}(L),\ \mathit{true}(R) .
\end{aligned}
\tag{16}
\]

Here we suppose that the Boolean circuit is implemented using only not-and
gates and that the truth values of the input gates are given as facts.
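
To see how the three circuit rules compute gate values, the following Python sketch evaluates them as a least fixpoint over a list of NAND gates. It is only an illustration: the function name `circuit_values`, the triple representation of gates, and the small XOR circuit are assumptions made here and do not come from the paper.

```python
def circuit_values(nand_gates, inputs):
    """Least fixpoint of the three circuit rules.

    nand_gates: iterable of (G, L, R) triples, one per fact nand_gate(G, L, R)
    inputs:     dict mapping each input gate to its Boolean value
    Returns the sets of gates derived as true and as false.
    """
    true = {g for g, value in inputs.items() if value}
    false = {g for g, value in inputs.items() if not value}
    changed = True
    while changed:
        changed = False
        for g, left, right in nand_gates:
            # true(G) <- nand_gate(G, L, R), false(L)   and the same with false(R)
            if (left in false or right in false) and g not in true:
                true.add(g)
                changed = True
            # false(G) <- nand_gate(G, L, R), true(L), true(R)
            if left in true and right in true and g not in false:
                false.add(g)
                changed = True
    return true, false

# A hypothetical XOR circuit built from four NAND gates.
XOR = [("n1", "a", "b"), ("n2", "a", "n1"), ("n3", "b", "n1"), ("out", "n2", "n3")]

true, false = circuit_values(XOR, {"a": True, "b": False})
print("out" in true)
```

Each pass either derives a new atom or stops, and there are at most two atoms per gate, so the loop runs a polynomial number of times in the size of the circuit.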

Corollary 1. The MODEL problem for a fixed number d of variables and no
function symbols is NP-complete, if $d \ge 3$.

Theorem 3. The INSTANTIATION of an unlimited-variable ω-restricted program
is EXP-complete if no function symbols are allowed.

Proof. For inclusion, see Dantsin et al. [4]. The hardness can be proved by noting
that a deterministic EXP-time Turing machine M uses at most $2^{n^k}$ time steps
for some k when the length of the input is n. We have to show that we can
generate an exponential number of atoms representing time steps and tape cells
using a program whose size is polynomial with respect to the size of M. To do
this, we need to implement an $n^k$-bit binary counter that runs from 0 to $2^{n^k} - 1$.
This can be done by encoding the numbers as vectors of binary variables:

\[
\begin{aligned}
\mathit{number}(0, \ldots, 0) \leftarrow{} & \\
\mathit{number}(y_1, \ldots, y_{n^k}) \leftarrow{} & \mathit{bit}(y_1), \ldots, \mathit{bit}(y_{n^k}), \\
& \mathit{number}(x_1, \ldots, x_{n^k}), \\
& \mathit{next}(x_1, \ldots, x_{n^k}, y_1, \ldots, y_{n^k}) .
\end{aligned}
\tag{17}
\]

The predicate bit/1 is an auxiliary with the extension {bit(0), bit(1)} that is used
to ensure that the rule is ω-restricted. The successor relation can be encoded with
the rule:

\[
\begin{aligned}
\mathit{next}(x_1, \ldots, x_{n^k}, y_1, \ldots, y_{n^k}) \leftarrow{} & \mathit{add}(x_1, 1, y_1, c_1), \\
& \mathit{add}(x_2, c_1, y_2, c_2), \\
& \;\vdots \\
& \mathit{add}(x_{n^k}, c_{n^k - 1}, y_{n^k}, c_{n^k})
\end{aligned}
\tag{18}
\]

where add/4 is defined using the following four facts:

\[
\begin{aligned}
\mathit{add}(0, 1, 1, 0) &\leftarrow & \mathit{add}(0, 0, 0, 0) &\leftarrow \\
\mathit{add}(1, 0, 1, 0) &\leftarrow & \mathit{add}(1, 1, 0, 1) &\leftarrow .
\end{aligned}
\tag{19}
\]

We can implement the predecessor function by switching the arguments of the
next predicate. Now the time steps and tape positions can be defined in terms
of numbers:

\[
\begin{aligned}
\mathit{time}(x_1, \ldots, x_{n^k}) &\leftarrow \mathit{number}(x_1, \ldots, x_{n^k}) \\
\mathit{place}(x_1, \ldots, x_{n^k}) &\leftarrow \mathit{number}(x_1, \ldots, x_{n^k}) .
\end{aligned}
\tag{20}
\]

Finally, we replace all references to time/1 and place/1 by time/$n^k$ and place/$n^k$
and add all necessary domain predicates to the rule bodies.
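
The four add/4 facts behave like a half adder: the third argument is the sum bit and the fourth is the carry out. The Python sketch below chains them as in rule (18) and enumerates the resulting next relation for a small number of bits. It is an illustration under assumptions: the function names are hypothetical, the first vector position is taken to be the least significant bit (which is what feeding the constant carry 1 into the first add atom suggests), and the final carry is simply ignored, so the all-ones vector wraps around to zero.

```python
from itertools import product

# add(x, carry_in, y, carry_out): the four facts from the text
ADD = {(0, 1, 1, 0), (0, 0, 0, 0), (1, 0, 1, 0), (1, 1, 0, 1)}

def next_relation(bits):
    """All pairs (x, y) of bit vectors related by the chained add/4 atoms."""
    pairs = set()
    for x in product((0, 1), repeat=bits):
        carry, y = 1, []                 # the first add atom receives carry 1
        for xi in x:
            # exactly one fact matches a given (bit, carry_in) combination
            (yi, carry), = [(s, c_out) for (b, c_in, s, c_out) in ADD
                            if (b, c_in) == (xi, carry)]
            y.append(yi)
        pairs.add((x, tuple(y)))         # the last carry is ignored in this sketch
    return pairs

def to_int(vector):
    """Value of a bit vector whose first position is the least significant bit."""
    return sum(bit << i for i, bit in enumerate(vector))

for x, y in sorted(next_relation(3), key=lambda pair: to_int(pair[0])):
    print(to_int(x), "->", to_int(y))
```

The rule has one add atom per bit, so its size grows linearly with the number of bits, while the relation it defines has exponentially many tuples.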

Theorem 4. The INSTANTIATION of a fixed-variable ω-restricted program that
uses function symbols is EXP-complete, if $d \ge 8$.

Proof.

(a) Inclusion. Without loss of generality we may assume that there are k strata
with c rules each in P. Let us use $a_n$ to denote the number of ground instances
of rules that belong to the first n strata.

Since the number d of variables is fixed, a rule on the (n + 1)-stratum may
have at most $a_n^d$ ground instances. Now we can establish an upper bound for
the number of ground instances of rules on the stratum n + 1 or lower:

\[
\begin{aligned}
a_{n+1} &= c \cdot a_n^d + a_n \le c \cdot a_n^d + c \cdot a_n^d \quad (\text{when } d > 1) \\
&= 2c \cdot a_n^d \\
&\le 2^{(\log_2 c) d^n + (\log_2 c + 1) d^{n-1} + \cdots + (\log_2 c + 1)}
\end{aligned}
\tag{21}
\]
As both c and d are linear with respect to the size of the program, $a_n$ is
bounded by an exponential in the size of the program, so the problem is in EXP.

(b) Hardness. As in the proof of Theorem 3, we need only to construct a binary
counter from 0 up to $2^{n^k} - 1$. We do this by encoding an m-bit binary
number x as a function $b_1(b_2(\cdots b_m(0) \cdots))$ where $b_i$ is f if the ith bit of x
is 0 and t if it is 1. The m-bit binary numbers can be generated recursively
from (m - 1)-bit numbers by the following two rules:

\[
\begin{aligned}
\mathit{number}_m(t(x)) &\leftarrow \mathit{number}_{m-1}(x) \\
\mathit{number}_m(f(x)) &\leftarrow \mathit{number}_{m-1}(x)
\end{aligned}
\tag{22}
\]

Here we need m + 1 different number predicates since otherwise the rules
would not be ω-restricted. As the base case of the recursion, we define
one 0-bit number as:

\[ \mathit{number}_0(0) \leftarrow . \tag{23} \]

The successor relation can also be defined recursively:

\[
\begin{aligned}
\mathit{next}_m(t(x), t(y)) &\leftarrow \mathit{next}_{m-1}(x, y) \\
\mathit{next}_m(f(x), f(y)) &\leftarrow \mathit{next}_{m-1}(x, y) \\
\mathit{next}_m(f(x), t(y)) &\leftarrow \mathit{last}_{m-1}(x),\ \mathit{first}_{m-1}(y)
\end{aligned}
\tag{24}
\]

where $\mathit{last}_m/1$ and $\mathit{first}_m/1$ are defined as:

\[
\begin{aligned}
\mathit{last}_m(t^m(0)) &\leftarrow \\
\mathit{first}_m(f^m(0)) &\leftarrow .
\end{aligned}
\tag{25}
\]

The translation uses $7n^k + 3$ rules to create all $n^k$-bit numbers so we now have
a polynomial reduction from EXP-time Turing machines to ω-restricted
programs using only function symbols and the proof is completed.
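
The recursive construction can be traced with a small Python sketch that applies the number, next, last, and first rules level by level. This is an illustration under assumptions, not code from the paper: nested terms are modelled as tuples such as ("t", ("f", 0)) for t(f(0)), the outermost symbol is read as the most significant bit, and the names `build_counter` and `value` are hypothetical.

```python
def build_counter(m_max):
    """Extensions of number_m and next_m for m = 0 .. m_max."""
    number = {0: {0}}            # number_0(0)
    nxt = {0: set()}             # there is no successor pair among 0-bit numbers
    last, first = 0, 0           # t^0(0) and f^0(0) are both the constant 0
    for m in range(1, m_max + 1):
        # number_m(t(x)) and number_m(f(x)) for every (m-1)-bit number x
        number[m] = {(b, x) for x in number[m - 1] for b in ("t", "f")}
        # the leading symbol is unchanged while the shorter suffix is incremented
        nxt[m] = {((b, x), (b, y)) for (x, y) in nxt[m - 1] for b in ("t", "f")}
        # overflow of the suffix: f(t...t) is followed by t(f...f)
        nxt[m].add((("f", last), ("t", first)))
        last, first = ("t", last), ("f", first)
    return number, nxt

def value(term):
    """Integer denoted by a term; the outermost symbol is the highest bit."""
    result = 0
    while term != 0:
        symbol, term = term
        result = 2 * result + (1 if symbol == "t" else 0)
    return result

number, nxt = build_counter(3)
for x, y in sorted(nxt[3], key=lambda pair: value(pair[0])):
    print(value(x), "->", value(y))
```

Only a constant number of rules is needed per level, yet level m already contains $2^m$ ground terms, which is the exponential blow-up the hardness argument relies on.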

Corollary 2. The MODEL problem of a fixed-variable ω-restricted program that
uses function symbols is NEXP-complete, if $d \ge 8$.

Theorem 5. The INSTANTIATION of an ω-restricted program is 2-EXP-complete.

Proof. We can combine the proofs of Theorems 3 and 4 to see that the problem
is in 2-EXP and that it is possible to implement all $2^{n^k}$-bit integers by putting
together the two different exponential constructions.

Corollary 3. The MODEL problem of an ω-restricted program is 2-NEXP-complete.


Omega-Restricted  Logic  Programs  279 


6  Conclusions 

We defined a new class of logic programs, ω-restricted programs, that are
decidable even when function symbols are used. We showed that the computational
complexity of the program class is 2-NEXP-complete. If we make further
restrictions either by fixing the maximum number of variables that may occur in
a rule or by disallowing the function symbols, the complexity drops to
NEXP-complete. If both restrictions are in effect, the complexity drops further
to NP-complete. We have implemented the ω-restricted programs in the Smodels
system.
Abstract. Most Answer Set Programming (ASP) systems, including
DLV and Smodels, are endowed with an instantiation module. The
instantiator generates a new program which is equivalent to the input
program, but does not contain any variables (i.e., it is ground). Normal
(i.e., disjunction-free) stratified programs are completely solved by the
instantiator, which generates the output model directly.

The instantiation process may be computationally expensive in some
cases, and the instantiator is crucial for the efficiency of the entire ASP
system. In this paper, we propose to employ join-ordering techniques to
improve the instantiation process. We design a new join-ordering method,
and adapt a classical database method to this context. We implement
these techniques in the ASP system DLV, and we carry out experiments
on a collection of benchmark problems taken from different domains.
The results of the experiments are very positive: the new techniques
appreciably improve the efficiency of the DLV system, whose instantiation
module proves to be a major strength of DLV with respect to the
other ASP systems.


1  Introduction 

The recent implementation of knowledge base systems which efficiently support
expressive logic-based languages, like DLV [5], Smodels [14], DCS [1], XSB [17],
QUIP [2], and CCALC [13], has renewed the interest in the area of
nonmonotonic reasoning and declarative logic programming. The advances made in
this area allow us to use ASP systems, like DLV and Smodels, for solving
real-world problems in a number of application areas, including planning and
scheduling, as well as for complex data manipulations [3] [19]. For instance,
Smodels is being used for the automatic configuration of software distributions,
while the latest application of DLV, issued by the Italian national statistics
institute (ISTAT), concerns the automatic correction of census data.

These  systems  support  a  fully  declarative  programming  style,  called  Answer 
Set  Programming  (ASP).  The  knowledge  representation  language  of  ASP  is  very 

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  280-294,  2001. 
expressive: function-free logic programs where nonmonotonic negation may occur
in  the  bodies  of  the  rules,  and  possibly  (i.e.,  for  some  systems)  with  classical 
negation  and  disjunction  in  the  heads  of  the  rules.  The  semantics  of  an  ASP 
program  P  is  given  by  its  answer  sets  [10],  which  are  subset-minimal  models 
of  P,  and  are  “grounded”  in  a  precise  sense.  The  idea  of  answer  set  programming 
is  to  represent  a  given  computational  problem  by  an  ASP  program  whose  answer 
sets  correspond  to  solutions,  and  then  use  an  answer  set  solver  to  find  such  a 
solution  [12]. 

As an example, consider the well-known problem of 3-colorability, which is
the assignment of three colors to the nodes of a graph in such a way that adjacent
nodes have different colors. This problem is known to be NP-complete. Suppose
that the nodes and the arcs are represented by a set F of facts with predicates
node (unary) and arc (binary), respectively (node and arc can be stored in the
tables representing the input database). Then, the following ASP program allows
us to determine the admissible ways of coloring the given graph.

\[
\begin{aligned}
r_1 &: \mathit{color}(X, r) \vee \mathit{color}(X, y) \vee \mathit{color}(X, g) \leftarrow \mathit{node}(X) \\
r_2 &: \leftarrow \mathit{arc}(X, Y),\ \mathit{color}(X, C),\ \mathit{color}(Y, C)
\end{aligned}
\]

Rule $r_1$ above states that every node of the graph is colored red or yellow or
green, while $r_2$ forbids the assignment of the same color to any adjacent nodes.
The minimality of answer sets guarantees that every node is assigned only one
color. Thus, there is a one-to-one correspondence between the solutions of the
3-coloring problem and the answer sets of $F \cup \{r_1, r_2\}$. The graph is 3-colorable
if and only if $F \cup \{r_1, r_2\}$ has some answer set.
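
The correspondence between colorings and answer sets can be checked on small graphs by brute force. The Python sketch below is not how an ASP system such as DLV evaluates the program; it simply enumerates every assignment of one color per node (the effect of the disjunctive rule together with minimality) and discards those that violate the constraint on adjacent nodes. The function name `colorings` and the example graphs are assumptions made for this illustration.

```python
from itertools import product

def colorings(nodes, arcs, colors=("r", "y", "g")):
    """All assignments of one color per node with differently colored arc ends."""
    admissible = []
    for choice in product(colors, repeat=len(nodes)):
        color = dict(zip(nodes, choice))
        # the constraint rules out any arc whose end points share a color
        if all(color[x] != color[y] for x, y in arcs):
            admissible.append(color)
    return admissible

triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
k4_nodes = ["a", "b", "c", "d"]
k4 = (k4_nodes, [(x, y) for i, x in enumerate(k4_nodes) for y in k4_nodes[i + 1:]])

print(len(colorings(*triangle)))   # number of admissible colorings of a triangle
print(len(colorings(*k4)))         # the complete graph on four nodes
```

Under the stated correspondence, each returned assignment matches one answer set of the program for that graph, and an empty result means the program has no answer set.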

ASP is very expressive: every problem in the complexity class $\Sigma_2^P$ (i.e., in
$\mathrm{NP}^{\mathrm{NP}}$) can be directly encoded in an ASP program which can then be used
to solve all problem instances in a uniform way [4]. The high expressiveness
of answer set programming comes at the price of a high computational cost in
the worst case. Indeed, computing an answer set of a disjunctive (resp., normal)
propositional ASP program is $\Sigma_2^P$-hard (resp., NP-hard). The design and the
implementation of suitable optimization techniques is therefore fundamental for
ASP systems.

The kernel modules of the ASP systems operate on a ground instantiation
of the input program, i.e., a program that does not contain any variables, but
is (semantically) equivalent to the original input [5]. Therefore, an efficient
instantiation procedure is of utmost importance.¹ The efficiency of an instantiation
procedure can be measured in terms of the size of its output and the time needed
to generate this instantiation. In a previous work, the DLV team has presented
some rewriting techniques which reduce the size of the generated ground
instantiation [6]. In this paper we optimize the execution time needed to generate
the ground instantiation. The main contributions of the paper are the following:

-  We propose the use of join-ordering techniques to improve the efficiency of
the instantiation procedures of ASP systems. In particular, a join-optimization
technique can be employed to re-order the body literals of a rule during the
instantiation process.

-  We design a new join-ordering method, and adapt a classical database method
to our context.

-  We implement the above join-ordering methods in the ASP system DLV.

-  To check the impact of our methods on the instantiator of DLV, we
experimentally compare the techniques that we implemented.

-  To assess the validity of our results more generally, we also compare the
instantiator of DLV, resulting from our enhancements, to the newest version of
the instantiator of Smodels, released in March 2001.

¹  Note that the disjunction-free stratified programs are “solved” by the instantiation
procedure, which provides the answer and does not generate any instantiation in this
case. Thus, the instantiator alone has the full power of a deductive database system.

The results of the experiments are very positive: the new techniques
appreciably improve the efficiency of the DLV instantiator, which compares
favourably against the instantiator of Smodels.

2  The Instantiation Procedure of DLV

In  this  section,  we  provide  a  short  description  of  the  overall  instantiation  module 
of  the  DLV  system,  and  focus  on  the  “heart”  procedure  of  this  module  which 
produces  all  ground  instances  of  a  given  rule,  which  will  be  optimized  in  the  next 
sections  through  the  introduction  of  the  join-ordering  methods.  We  assume  that 
the  reader  is  familiar  with  ASP  syntax  and  semantics.  An  extensive  description 
can  be  found  in  [10]  and  [3]. 


Fig. 1. Architecture of DLV’s Instantiator


The aim of the instantiator is mainly twofold: (i) to evaluate (∨-free) stratified
program components, and (ii) to generate the instantiation of disjunctive or
unstratified components (if the input program is disjunctive or unstratified).

In  order  to  evaluate  efficiently  stratified  programs  (components),  DLV  uses 
an  improved  version  of  the  generalized  semi-naive  technique  [20]  implemented 
for  the  evaluation  of  linear  and  non-linear  recursive  rules. 
If the input program is normal (i.e., ∨-free) and stratified, the instantiator
evaluates the program completely and no further module is employed after the
grounding; the program has a single answer set, namely the set of the facts and
the atoms derived by the instantiation procedure. If the input program is
disjunctive or unstratified, the instantiation procedure cannot evaluate the
program completely. However, the optimization techniques mentioned above are
useful to generate efficiently the instantiation of the non-monotonic part of the
program. Two aspects are crucial for the instantiation:

(a) the number of generated ground rules,

(b) the time needed to generate such an instantiation.

The size of the generated instantiation is important because it strongly influences
the computation time of the other modules of the system. A slower instantiation
procedure generating a smaller grounding may be preferable to a faster one
generating a large grounding. However, the time needed by the former cannot
be ignored; otherwise there would be no real gain in overall computation time.

The main reason for large groundings even for small input programs is that
each atom of a rule in $\mathcal{P}$ may be instantiated to many atoms in $B_{\mathcal{P}}$, which leads
to combinatorial explosion. However, most of these atoms may not be derivable
whatsoever, and hence such instantiations do not render applicable rules. The
instantiator module generates ground instances of rules containing only atoms
which can possibly be derived from $\mathcal{P}$.

Figure 1 depicts the general structure of the instantiator module. An input program $\mathcal{P}$ is first analyzed by the parser, which also builds the extensional database from the facts in the program and encodes the rules in the intensional database in a suitable way. Then, a rewriting procedure (see [6]) optimizes the rules in order to get an equivalent program $\mathcal{P}'$ that can be instantiated more efficiently and that can lead to a smaller ground program. The dependency graph (DG) builder computes the dependency graph of $\mathcal{P}'$, its connected components, and a topological ordering of these components. Finally, $\mathcal{P}'$ is instantiated one component at a time, starting from the lowest components in the topological ordering, i.e., those components that depend on no other component according to the dependency graph.

For space reasons, we omit a detailed description of the whole instantiation algorithm here; the interested reader can find it in the technical report [7]. Below, we describe the process of rule instantiation, the “heart” of the instantiation module, which we optimize in the next section by introducing join-ordering methods.

Let us first introduce some notation. For a rule r, we denote by H(r) the set $\{a_1, \ldots, a_n\}$ of the head atoms, and by B(r) the set $\{b_1, \ldots, b_k, \neg b_{k+1}, \ldots, \neg b_m\}$ of the body literals. $B^+(r)$ (resp., $B^-(r)$) denotes the set of atoms occurring positively (resp., negatively) in B(r). For a literal L, var(L) denotes the set of variables occurring in L. For a conjunction (or a set) of literals C, var(C) denotes the set of variables occurring in the literals in C, and, for a rule r, $var(r) = var(H(r)) \cup var(B(r))$.

The procedure InstantiateRule, shown in Figure 2, generates the ground instances of a rule r of a program $\mathcal{P}$. When this procedure is called for the rule r,

Forward Procedure FirstMatch(θ: Substitution, A: Atom, var MatchFound: Boolean, var θ': Substitution);
(* Given a partial substitution θ for the rule's variables, and an atom A of the body,
   the procedure computes the first tuple t of the relation corresponding to A which
   matches θ. It returns in θ' the extension of θ, where the free variables of A
   have been instantiated with the corresponding constants in t. The boolean variable
   MatchFound evaluates to True iff such a matching tuple has been found; otherwise it
   evaluates to False, and θ' is meaningless. *)

Forward Procedure NextMatch(θ: Substitution, A: Atom, var MatchFound: Boolean, var θ': Substitution);
(* Similar to FirstMatch, but finds the next matching tuple. *)

Function InstantiateConjunction(C: Conjunction; θ: Substitution): SetOfSubsts;
var MatchFound: Boolean; A: Atom; B: Conjunction; θ': Substitution; S: SetOfSubsts;
begin
    if C is empty   (* the end of the body has been reached, θ is a legal substitution *)
    then return({θ});
    S := ∅; A := first_conjunct(C); B := rest_conjunct(C);
    FirstMatch(θ, A, MatchFound, θ');
    while MatchFound do
        S := S ∪ InstantiateConjunction(B, θ');
        NextMatch(θ, A, MatchFound, θ');
    end_while;
    return(S);
end;

Function InstantiateRule(r: Rule): SetOfGroundRules;
var θ: Substitution; S: SetOfSubstitutions;
begin
    Let B+ denote the Conjunction of the positive literals in the body of r;
    Order_Body(B+);
    θ := empty substitution;
    S := InstantiateConjunction(B+, θ);
    return({γr | γ ∈ S});
end;


Fig. 2. The process of rule instantiation

for each atom A occurring in the body of r, the set of ground instances $I_A$ for A previously computed by the instantiator is collected in a relation rel(A) that we call the extension of A. Each ground instance $a \in I_A$ corresponds to a tuple in rel(A) and vice versa. More precisely, each tuple of constants in the relation rel(A) corresponds to a substitution $\theta : var(A) \rightarrow U_{\mathcal{P}}$ such that $\theta A \in I_A$, and vice versa. Such a substitution θ is called a valid substitution for r with respect to the given extensions of the atoms occurring in its body. Intuitively, InstantiateRule performs the natural join of the relations associated with the positive body literals of the rule. Since the rule is safe, each variable of the rule also appears in a positive body literal of the rule. Therefore, such a join is in a one-to-one correspondence with the set of all ground instances of the rule which are constructable from the set of available instances for the body atoms; each tuple of this join corresponds to a valid substitution for r with respect to the given extensions of atoms.

Roughly, InstantiateRule first orders the conjunction $B^+$ of the positive body literals of the rule r to be instantiated (by a call to procedure Order_Body, which will be described in the next section), and then calls the function InstantiateConjunction, which actually computes the legal instantiations of $B^+$. This function starts from the first atom A in the conjunction $B^+$. It finds, by a call to FirstMatch, the first tuple t matching A in the relation rel(A) associated with A, and binds the variables of A to the corresponding constants in t. Then, by a recursive call to InstantiateConjunction itself, this function takes the second atom, say A', in $B^+$, and binds its free variables (note that some variables of A' are already bound, if they also appear in A) by finding the first matching tuple in rel(A'). The process goes on until either (i) the end of the conjunction has been reached (C is empty in function InstantiateConjunction), or (ii) no matching tuple is found for some body atom (a call to a match function returned MatchFound = False). In the latter case, the (partial) substitution θ at hand is not good, since no instance of the current atom agrees with θ. Therefore, the current run of function InstantiateConjunction terminates, the calling function changes θ by finding another matching tuple, and restarts the forward instantiation phase. In the former case (i.e., in case (i)), the substitution at hand (the parameter θ previously computed by the matching functions) is returned, as it instantiates all the variables of the rule and hence induces a ground instance of the rule r. The calling function adds θ to the set S of the computed substitutions, and finds another match for the atom at hand to generate further ground instances of r. The process terminates when no more matches are found (i.e., no more ground instances can be generated).
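The backtracking join just described can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the DLV code: atoms are hypothetical `(predicate, arguments)` pairs, variables are argument strings starting with an uppercase letter, `rel` maps each predicate name to a list of tuples, and a plain `for` loop over the relation stands in for the FirstMatch/NextMatch pair.

```python
def match(atom, tup, theta):
    """Try to extend the substitution theta so that the atom's arguments equal tup.
    Returns the extended substitution, or None if tup disagrees with theta."""
    _pred, args = atom
    new = dict(theta)
    for arg, const in zip(args, tup):
        if arg[0].isupper():                       # variable (assumed naming convention)
            if new.setdefault(arg, const) != const:
                return None                        # already bound to a different constant
        elif arg != const:                         # constant argument must match exactly
            return None
    return new

def instantiate_conjunction(conj, theta, rel):
    """Return all substitutions extending theta that satisfy every atom in conj."""
    if not conj:                                   # end of the body: theta is legal
        return [theta]
    first, rest = conj[0], conj[1:]
    substs = []
    for tup in rel[first[0]]:                      # plays the role of FirstMatch/NextMatch
        theta2 = match(first, tup, theta)
        if theta2 is not None:
            substs.extend(instantiate_conjunction(rest, theta2, rel))
    return substs
```

With `rel = {"edge": [("a","b"), ("b","c"), ("c","d")]}` and the body `edge(X,Y), edge(Y,Z)`, the call `instantiate_conjunction(body, {}, rel)` yields one substitution per two-step path in the graph.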


3  Join-Ordering  Methods 

From the previous section, it should be clear that computing all the possible instantiations of a rule, given the relations associated with the atoms occurring in its body, is equivalent to computing all the answers of the conjunctive query joining the relations of the positive literals of the rule's body. A key issue for the efficient instantiation of a (non-trivial) rule r is thus the optimal ordering of the literals in its body. This problem clearly corresponds to the choice of an optimal execution ordering for the join operations in a conjunctive query.

The ordering dramatically affects the overall computation time. Many relevant real-world examples containing large relations (see next section) cannot be solved without a suitable ordering of the body atoms.

It is worth noting that in ASP programs we have to instantiate many rules and, for recursive programs, we have to instantiate the same rule many times, possibly with different relations (until we reach a fixpoint). Therefore, the ordering procedure is called very often and must itself be efficient.

Let r be a rule and $B^+$ the conjunction of the atoms occurring in its body. Our procedure Order_Body gets $B^+$ as its input and modifies the ordering of the atoms in this conjunction in order to minimize the instantiation time of r.

To choose an optimal ordering, we exploit some information about the relations associated with the atoms in $B^+$, also called the extensions of these atoms. For each atom A occurring in $B^+$, we know the number T(A) of tuples in its associated relation rel(A) and, for each variable $X \in var(A)$, the number V(X, A) of distinct values for X over rel(A) (i.e., the number of tuples in the projection of rel(A) onto X).
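As a small illustration of these two statistics, the following Python sketch computes T(A) and V(X, A) for a relation stored as a list of tuples. The function name `statistics` and the data layout (argument names as strings, variables starting with an uppercase letter, each variable occurring once in the atom) are assumptions made only for this example.

```python
def statistics(args, tuples):
    """Return (T, V) for an atom with argument list `args` and extension `tuples`.
    T is the number of tuples; V maps each variable to its number of distinct values."""
    T = len(tuples)
    V = {}
    for pos, arg in enumerate(args):
        if arg[0].isupper():                        # only variables get a V entry
            V[arg] = len({t[pos] for t in tuples})  # size of the projection onto arg
    return T, V
```

For example, for an atom `p(X, Y)` with extension `[(1, 2), (1, 3), (2, 3)]`, the relation has three tuples, and both X and Y take two distinct values.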

Recall  that  the  relations  associated  to  the  atoms  of  r  change,  in  general,  at 
each  call  to  InstantiateRule(r).  Of  course,  there  is  more  than  one  call  only  if  r 
belongs  to  some  recursive  component  of  the  program. 

To meet both the requirement of an efficient optimization procedure and that of an efficient instantiation procedure, we employ a greedy algorithm, very similar to the one used in traditional database systems for selecting an optimal left-deep join tree for a given conjunctive query [9]. Roughly, at each step i > 1, we have already placed the first i − 1 atoms, and we make a greedy choice to select the ith atom in the final ordering of $B^+$: we choose an atom A that is minimal with respect to some selectivity criterion.
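The greedy scheme can be summarized by the following Python skeleton. It is only a sketch: the name `greedy_order` and the callback `score(placed, candidate)`, which stands for whichever selectivity criterion is plugged in, are hypothetical and not part of DLV.

```python
def greedy_order(atoms, score):
    """Order `atoms` greedily: at each step pick the remaining atom with the
    smallest score, where score(placed, candidate) may inspect the atoms
    already placed (lower is better)."""
    placed, remaining = [], list(atoms)
    while remaining:
        best = min(remaining, key=lambda a: score(placed, a))
        remaining.remove(best)
        placed.append(best)
    return placed
```

Using the extension size as the score, `greedy_order(["r", "p", "q"], lambda placed, a: {"p": 30, "q": 6, "r": 300}[a])` places the atoms from the smallest to the largest relation. Each criterion discussed in this section corresponds to a different choice of the scoring function.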

For the sake of simplicity, we will use the name of an atom A to represent both the atom and its extension rel(A), whenever no confusion arises. We will denote by $B_{i-1}$ the set of the first i − 1 atoms and by $rel(B_{i-1})$ (or, simply, by $B_{i-1}$) the relation obtained by computing the join of all the extensions of the first i − 1 atoms. Hence, the number of tuples $T(B_{i-1})$ in this relation is equal to the number of consistent substitutions for the atoms in $B_{i-1}$. Moreover, we denote by $var(B_{i-1})$ the set of variables occurring in $B_{i-1}$. These variables are called the bound variables at step i. Therefore, for any bound variable $X \in var(B_{i-1})$, $V(X, B_{i-1})$ is the number of distinct values that X may take over the computed relation $B_{i-1}$ (or, equivalently, over the set of consistent substitutions for the atoms in $B_{i-1}$). We estimate these numbers during the ordering procedure from the statistics we have for the single atoms, rather than explicitly computing them at each step. Indeed, in this phase, we do not compute any join (or substitution).

We  next  describe  three  selectivity  criteria  that  we  implemented  in  the  DLV 
system.  The  first  is  the  one  used  in  the  current  version  of  DLV  [8],  the  second 
is  an  adapted  version  of  a  criterion  used  in  the  context  of  traditional  database 
systems  [9],  and  the  third  is  specifically  designed  for  our  purposes. 


3.1  Old-DLV  Criterion 

This is the simple method implemented in the current versions of DLV [8]. Let D be the set of all atoms in $B^+ - B_{i-1}$ having some bound variable at this step, i.e., having a variable in common with some atom in $B_{i-1}$.

We select the atom A to be placed in the ith position of the ordered body as follows:

- if $D \neq \emptyset$, then A is the atom belonging to D whose extension has the smallest cardinality over the atoms in D;

- otherwise, i.e., if no remaining atom has any bound variable (at step i), A is the atom whose extension has the smallest cardinality over all atoms in $B^+ - B_{i-1}$.

Therefore,  this  method  gives  the  maximum  priority  to  the  binding  of  variables, 
and  then  chooses  on  the  basis  of  the  cardinality  of  the  extensions. 

Example 1. Assume that we are computing the ground instantiation of a rule r and that we have already placed the first i − 1 atoms. Let X and Y be the bound variables at step i, i.e., $var(B_{i-1}) = \{X, Y\}$. Moreover, let P(X, X'), Q(Y, Y'), and R(X, Y, X', Y') be the remaining atoms in the body of r, i.e., the atoms in $B(r) - B_{i-1}$, and assume that the numbers of tuples in their extensions are T(P) = 30, T(Q) = 6, and T(R) = 300, respectively.

The Old-DLV criterion first looks for atoms having some bound variable. In this example, either X or Y occurs in every remaining atom. Thus, the Old-DLV criterion chooses Q(Y, Y') as the ith atom because its extension has the smallest cardinality.
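The selection rule of the Old-DLV criterion is simple enough to be written in a few lines of Python. The sketch below uses a hypothetical encoding (atom names mapped to their sets of variables, and a dictionary `T` of extension sizes) and reproduces the choice made in Example 1.

```python
def old_dlv_choice(remaining, bound_vars, T):
    """remaining: dict atom name -> set of variables; T: dict atom name -> cardinality.
    Prefer atoms sharing a variable with the atoms already placed; among the
    candidates, pick the one with the smallest extension."""
    with_bound = [a for a, vs in remaining.items() if vs & bound_vars]
    candidates = with_bound or list(remaining)   # fall back to all remaining atoms
    return min(candidates, key=lambda a: T[a])

# Data of Example 1
remaining = {"P": {"X", "X'"}, "Q": {"Y", "Y'"}, "R": {"X", "Y", "X'", "Y'"}}
T = {"P": 30, "Q": 6, "R": 300}
choice = old_dlv_choice(remaining, {"X", "Y"}, T)
```

Here `choice` is `"Q"`, in agreement with the example.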

3.2  Join  Selectivity  Criterion 

This method is widely used in relational database systems [9]. We take as the ith atom in the ordered body the atom $A \in B^+ - B_{i-1}$ that minimizes the following selectivity index: $sel_j(A) = T(B_{i-1} \bowtie A) / T(B_{i-1})$.

Thus,  we  take  the  atom  A  which  leads  to  the  smallest  intermediate  relation  size, 
over  all  atoms  that  are  still  to  be  ordered. 

The size of a join operation between two relations R and S is $T(R) \cdot T(S)$ if they do not have any variable in common; otherwise, it is estimated as follows:

$$T(R \bowtie S) = \frac{T(R) \cdot T(S)}{\prod_{X \in var(R) \cap var(S)} \max\{V(X, R), V(X, S)\}},$$

where $\prod$ denotes the product operation.

Example 2. Consider again the rule r in Example 1. We next show how the ith atom is chosen according to the join-selectivity criterion. In this case, we need some additional statistics. Let $V(X, B_{i-1}) = 30$ and $V(Y, B_{i-1}) = 5$ be the current estimates of the number of different values for the bound variables at step i, i.e., for X and Y. Moreover, assume that the statistics for the bound variables occurring in the remaining atoms are V(X, P) = 30, V(Y, Q) = 5, V(X, R) = 30, and V(Y, R) = 5. Consider the atom P(X, X'). The join-selectivity criterion assigns to this atom the following selectivity index:

$$sel_j(P) = \frac{T(B_{i-1}) \cdot T(P)}{\max\{V(X, B_{i-1}), V(X, P)\}} \cdot \frac{1}{T(B_{i-1})} = \frac{30}{30} = 1.$$

Similarly, the other atoms get $sel_j(Q) = 1.2$ and $sel_j(R) = 2$. Thus, according to the join-selectivity criterion, the ith atom is P(X, X').

Note that the above estimation of the size of a join is based on the following simplifying assumptions:

Containment of value sets. If $V(X, R) \le V(X, S)$, then every possible value for variable X in R is also a possible value for X in S.

Preservation of value sets. If $Y \in var(S)$ is not a join attribute, i.e., $Y \notin var(R) \cap var(S)$, then $V(Y, R \bowtie S) = V(Y, S)$. That is, performing a join operation, we do not lose values for non-join variables.

The  interested  reader  can  find  a  more  detailed  discussion  of  these  assumptions 
and  of  this  selectivity  criterion  in  [9]. 

Here, we just observe that, because of the above assumptions, after the choice of an atom A we update the statistics of the value sets of its variables as follows: $V(X, B_i) = \min\{V(X, A), V(X, B_{i-1})\}$, if X is a bound variable at step i; otherwise, $V(X, B_i) = V(X, A)$.
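The join-size estimate, the index $sel_j$, and the update of the statistics can be checked numerically with the following Python sketch. The function names and the dictionary-based encoding of the V statistics are assumptions of this illustration. Since the example does not state $T(B_{i-1})$, an arbitrary value (1000) is used; it cancels out in the index. The value V(X', P) = 20 used for the update is likewise hypothetical.

```python
from math import prod

def join_size(T_R, V_R, T_S, V_S):
    """Estimated size of the join of R and S; V_R and V_S map variables to
    numbers of distinct values. With no shared variable this is T_R * T_S."""
    shared = V_R.keys() & V_S.keys()
    return T_R * T_S / prod(max(V_R[x], V_S[x]) for x in shared)

def sel_j(T_B, V_B, T_A, V_A):
    """Join-selectivity index of atom A with respect to the placed atoms B."""
    return join_size(T_B, V_B, T_A, V_A) / T_B

def update_after_join(V_B, V_A):
    """Statistics for B_i after placing A (containment/preservation assumptions)."""
    new = dict(V_B)
    for x, v in V_A.items():
        new[x] = min(v, V_B[x]) if x in V_B else v
    return new

# Data of Example 2; T_B = 1000 is an arbitrary placeholder
T_B, V_B = 1000, {"X": 30, "Y": 5}
indices = {
    "P": sel_j(T_B, V_B, 30, {"X": 30}),
    "Q": sel_j(T_B, V_B, 6, {"Y": 5}),
    "R": sel_j(T_B, V_B, 300, {"X": 30, "Y": 5}),
}
best = min(indices, key=indices.get)
```

The three indices come out as 1, 1.2 and 2 (up to floating-point rounding), so `best` is `"P"`, matching the example.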


3.3  Combined  Criterion 


This  selectivity  criterion  explicitly  deals  with  both  the  size  of  the  intermediate 
result  and  the  binding  of  variables,  trying  to  minimize  both  these  factors. 

For this criterion we exploit, as additional statistics, the size of the (active) domains of the variables occurring in r (with respect to the current call of InstantiateRule(r)). We estimate this number, denoted by dom(X), by $\max_{A \in B^+} V(X, A)$. In other words, we assume that there is a relation rel(A), associated with some $A \in B^+$, which provides the active domain for X, i.e., which contains all values for X that also occur in the extensions of other atoms in r. In practice, this is the case most of the time and, if not, dom(X) is usually very close to the cardinality of the actual domain for X.

The combined criterion takes as the ith atom in the ordered body the atom $A \in B^+ - B_{i-1}$ that minimizes the selectivity index $sel_c(A) = sel_s(A) \cdot sel_b(A)$, where

$$sel_s(A) = \frac{T(B_{i-1} \ltimes A)}{\prod_{X \in Z} dom(X)}$$

and

$$sel_b(A) = \prod_{Y \in var(B_{i-1}) \cap var(A)} \frac{V(Y, A)}{dom(Y)^2},$$

where Z is the set of variables that A has in common with some other atom occurring in $B^+$, $\ltimes$ denotes the semijoin operation, and $sel_b(A) = 1$ in the trivial case $var(B_{i-1}) \cap var(A) = \emptyset$.


Example 3. We show how the combined criterion acts on the same rule considered in Example 1 and Example 2. We estimate the cardinality of the active domains for the variables. From the given statistics, we get dom(X) = 100, dom(Y) = 100, dom(X') = 20 and dom(Y') = 20. For the atom P(X, X'), according to the Combined criterion, we compute

$$T(B_{i-1} \ltimes P) = T(P) \cdot \frac{V(X, B_{i-1})}{dom(X)} = 30 \cdot \frac{30}{100} = 9,$$

and hence

$$sel_s(P) = \frac{T(B_{i-1} \ltimes P)}{dom(X) \cdot dom(X')} = \frac{9}{100 \cdot 20}, \qquad sel_b(P) = \frac{V(X, P)}{dom(X)^2} = \frac{30}{100^2}.$$

Then, the selectivity index for P is

$$sel_c(P) = sel_s(P) \cdot sel_b(P) = \frac{9}{100 \cdot 20} \cdot \frac{30}{100^2}.$$

Similarly, for Q and R, we get $sel_c(Q)$ = 7.5e-8 and $sel_c(R)$ = 1.69e-12. It follows that, according to the Combined criterion, we choose R(X, Y, X', Y') as the ith atom in the ordered body of r. Note that, after the choice of this atom, all the variables become bound.

Note that the selectivity index $sel_s(A)$ is a measure of how much the choice of A reduces the search space for possible substitutions. In fact, for the set of variables Z that A has in common with some other atom in $B^+$, the full search space counts $\prod_{X \in Z} dom(X)$ possible substitutions (or, equivalently, this is the size of the full relation over these variables). However, only $m = T(B_{i-1} \ltimes A)$ tuples of A are compatible with the previously chosen atoms. Thus, m represents the new maximum number of tuples of values for the variables in Z. Note that this criterion tends to prefer atoms with large arities. Assume that the extensions of two atoms A' and A'' have the same cardinality, that the domains of all variables are the same, and that the arity of A' is greater than the arity of A''. Then, $sel_c(A')$ and $sel_c(A'')$ have the same numerators; however, $sel_c(A')$ has a bigger denominator and, most likely, A' provides a better reduction of the search space.

The selectivity index $sel_b(A)$ takes into account the bound variables of A. Indeed, by preferring atoms with already bound variables, we may detect possible inconsistencies very quickly. The index $sel_b(A)$ is 1 if A has no variable in common with the previously chosen atoms; otherwise, it is always < 1. It tends to prefer atoms with a large number of bound variables and with a smaller fraction of values with respect to the full domain cardinalities. Indeed, these atoms are the most promising for detecting possible inconsistencies with the previously chosen atoms.

In the implementation of this criterion, we make further use of the variables' domains to remove the assumption about containment of value sets. However, we keep the assumption, implicit in the classical join-size estimation, that values are distributed uniformly over their domains. It follows that the size of the semijoin operation can be estimated as follows:

$$T(R \ltimes S) = T(S) \cdot \prod_{X \in var(R) \cap var(S)} \frac{V(X, R)}{dom(X)}.$$

Moreover, after we choose an atom A, we update the statistics of the value sets of its variables as follows: $V(X, B_i) = V(X, B_{i-1}) \cdot (V(X, A) / dom(X))$, if X is a bound variable at step i; otherwise, $V(X, B_i) = V(X, A)$.
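To make the combined index concrete, the following Python sketch evaluates it on the data of Example 3. The function names and the dictionary encoding are assumptions of this illustration, not the DLV code; `V_A` only needs to list the bound variables of the atom, since the formulas use no other V statistics, and `shared_Z` is the set Z of variables the atom shares with other body atoms.

```python
from math import prod

def semijoin_size(T_A, bound, V_B, dom):
    """Estimate of T(B ⋉ A): tuples of A compatible with the placed atoms B,
    assuming values are uniformly distributed over their domains."""
    return T_A * prod(V_B[x] / dom[x] for x in bound)

def combined_index(T_A, V_A, shared_Z, V_B, dom):
    """sel_c(A) = sel_s(A) * sel_b(A) for a candidate atom A."""
    bound = [x for x in V_A if x in V_B]                  # bound variables of A
    sel_s = semijoin_size(T_A, bound, V_B, dom) / prod(dom[x] for x in shared_Z)
    sel_b = prod(V_A[y] / dom[y] ** 2 for y in bound)     # empty product is 1
    return sel_s * sel_b

# Data of Examples 1-3
dom = {"X": 100, "Y": 100, "X'": 20, "Y'": 20}
V_B = {"X": 30, "Y": 5}
sel_c = {
    "P": combined_index(30, {"X": 30}, ["X", "X'"], V_B, dom),
    "Q": combined_index(6, {"Y": 5}, ["Y", "Y'"], V_B, dom),
    "R": combined_index(300, {"X": 30, "Y": 5}, ["X", "Y", "X'", "Y'"], V_B, dom),
}
chosen = min(sel_c, key=sel_c.get)
```

The smallest index belongs to R, so `chosen` is `"R"`, as in Example 3, whereas the join-selectivity criterion selects P on the same data.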

4  Experimental  Results  and  Conclusion 

4.1  Benchmark  Programs 

To check the efficiency of the proposed methods, we have implemented them in the grounding engine of the DLV system, and we have run them on a collection of benchmark programs taken from different domains. We mainly selected programs where the instantiation process is hard and takes a significant part of the entire computation (e.g., CRISTAL, HANOI, RAMSEY), but we also considered a couple of problems where the instantiation process is easy compared to the process of model generation (BLOCKSWORLD and HAMILTONIAN-PATH). Because of space limitations, we cannot include the code of the benchmark programs in the paper. Rather, we provide below a very short description of the problems encoded in the benchmark programs. The programs encoding these problems, as well as the binaries used for our experiments, can be found at www.dbai.tuwien.ac.at/staff/leone/join-ordering/

RAMSEY(3,6) ≠ 17  Prove that 17 is not the Ramsey number Ramsey(3,6) [16].

HANOI[6discs,63steps]  Hanoi Towers with 6 discs and 63 steps.

CRISTAL  A deductive database application that involves complex knowledge manipulations on databases, developed at CERN in Switzerland.

K-DECOMP  Decide whether a conjunctive query has hypertree width at most K [11].

TIMETABLING  A timetabling problem for the first year of the Faculty of Science of the University of Calabria.

HAMILTONIAN-PATH  Hamiltonian Path on a random graph with 700 edges and 85 nodes.

BLOCKSWORLD  A typical planning problem where some blocks, placed on a table, have to be moved from an initial position to a desired final position.

CONSTRAINT-3COL  3-colorability, in a constraint-satisfaction-like encoding, on a graph with 30 nodes and 40 edges.


4.2  Old-DLV  Instantiator  vs.  the  New  Methods 

We  implemented  in  DLV  the  three  criteria  described  in  Section  3  and  we  com¬ 
pared  them  by  using  the  above  benchmark  problems.  All  experiments  were 
performed  on  an  Athlon/750  machine  with  256MB  of  main  memory  running 
FreeBSD  4.2.  The  binaries  were  produced  with  GCC  2.95.2. 


Table  1.  A  comparison  of  the  join-ordering  methods  of  Section  3 


Program                  Old-DLV   JoinSel   Combined
RAMSEY(3,6)≠17             64.10      8.98       8.50
HANOI[6discs,63steps]      12.20     71.65      14.58
CRISTAL                    19.53     14.73      13.37
K-DECOMP                   30.78     37.84      29.82
TIMETABLING               283.15    269.03     238.35
HAMILTONIAN-PATH            2.55      2.43       2.41
BLOCKSWORLD                 3.17      3.48       2.99
CONSTRAINT-3COL            84.01     34.98      31.64

The results of our tests are shown in Table 1. There, the first column gives the benchmark program; Columns 2-4 report the running times taken by DLV to generate the instantiation when the Old-DLV, JoinSel and Combined methods are used, respectively. All running times are expressed in seconds.

Old-DLV, the original technique employed in the DLV system, is the worst in most cases and is outperformed by both JoinSel and Combined. It is worth noting, however, that the Old-DLV criterion is not a “naive” method: it takes into account both the binding of variables and the size of the extensions of atoms. In fact, it performs quite well on a number of problems, e.g., HAMILTONIAN-PATH, BLOCKSWORLD, and K-DECOMP. In particular, in the latter case, it is better than the Join-selectivity approach. However, it gets worse for problems where rules contain many atoms and/or atoms with large extensions, e.g., RAMSEY and TIMETABLING.

The  Join-selectivity  criterion  guarantees  good  performance  on  a  large  number 
of  programs,  because  it  is  based  on  the  minimization  of  the  intermediate  partial 
relation  computed  at  each  step.  In  some  way,  its  formulation  also  takes  into 
account  the  binding  of  variables.  Indeed,  a  larger  number  of  bound  variables  in 
an  atom  leads  to  more  selective  joins  (i.e.,  joins  with  a  smaller  index). 

The combined criterion yields the best performance for the considered problems on average. Its main advantage comes from exploiting the fact that large-arity atoms can reduce the number of allowed substitutions for many variables at once, provided that their extensions are not too big. For this reason, the procedure based on this method outperforms the Join-selectivity method on HANOI, K-DECOMP and TIMETABLING. Thus, the combined criterion that we propose in this paper seems to be appropriate for the purpose of body reordering. It is worth noting that we also tried a number of variants of this criterion for tuning the contribution of the different factors. However, for the considered examples, the formula described in Section 3.3 has given the best results on average.
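The aggregate claims can be checked directly against the figures of Table 1. The short Python sketch below only re-adds the published running times; the variable names are chosen for this illustration.

```python
# Running times in seconds from Table 1: (Old-DLV, JoinSel, Combined)
table1 = {
    "RAMSEY":           (64.10, 8.98, 8.50),
    "HANOI":            (12.20, 71.65, 14.58),
    "CRISTAL":          (19.53, 14.73, 13.37),
    "K-DECOMP":         (30.78, 37.84, 29.82),
    "TIMETABLING":      (283.15, 269.03, 238.35),
    "HAMILTONIAN-PATH": (2.55, 2.43, 2.41),
    "BLOCKSWORLD":      (3.17, 3.48, 2.99),
    "CONSTRAINT-3COL":  (84.01, 34.98, 31.64),
}
methods = ("Old-DLV", "JoinSel", "Combined")

# Total time per method over all benchmarks
totals = {m: round(sum(row[i] for row in table1.values()), 2)
          for i, m in enumerate(methods)}

# Number of benchmarks on which each method is the fastest
wins = {m: 0 for m in methods}
for row in table1.values():
    wins[methods[row.index(min(row))]] += 1
```

Summed over the eight benchmarks, the totals are about 499.49 s for Old-DLV, 443.12 s for JoinSel and 341.66 s for Combined; Combined is the fastest method on seven benchmarks and Old-DLV on one (HANOI).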

4.3 The Enhanced DLV Instantiator vs. Lparse

Finally, we compared the instantiator of DLV against Lparse, the instantiator of Smodels [14], a prominent ASP system.¹ The newest version of Lparse (release 1.0.4, 03-21-2001) accepts logic programs respecting the extended-domain restriction. This condition requires each variable of a rule to occur in a positive body literal, called a domain literal, which (i) is not mutually recursive with the head, and (ii) is neither unstratified nor (transitively) dependent on an unstratified literal (see the Smodels manual in [18] for details). To instantiate a rule r, Lparse employs a nested loop scanning the extensions of the domain predicates occurring in its body, and generates the ground instances of r accordingly (i.e., by applying the substitutions obtained from the domain atoms and disregarding the substitutions violating either some built-in predicate or some variable patterns). Table 2 shows a comparison between the instantiator of DLV (with the combined criterion) and Lparse (release 1.0.4). For both systems, we report the time (CPU + I/O time) they take to instantiate the program and the size (number of rules) of the output instantiation. The symbol '-' means that the instantiator did not terminate within 20 minutes.

¹ Since the benchmark programs are head-cycle free, we could eliminate disjunction and translate them into the language accepted by Lparse.

Note that, even if both DLV and Lparse compute ground programs that are equivalent (with respect to the answer set semantics), the sizes of the respective instantiations may differ significantly. This is due to the different ways they instantiate a rule r: DLV computes the join of the extensions of the positive literals in the body of r, while Lparse enumerates with a nested loop all the extensions of the domain predicates in the rule. Thus, the strategy of Lparse is computationally less expensive (since no join is computed) if the cartesian product of these extensions is small, i.e., if there are few domains to scan or they have small extensions. However, Lparse may produce a needlessly larger instantiation than DLV, since the rules generated by Lparse may contain non-domain body literals which are certainly not derivable (i.e., they do not appear in the head of any rule of the instantiation having an applicable body). Indeed, the results in Table 2 show that Lparse is sometimes very fast, and in fact faster than DLV (e.g., for K-DECOMP, TIMETABLING and HAMILTONIAN-PATH, where there are few domain predicates in each rule); but the size of Lparse's instantiation is always larger than the size of the instantiation computed by DLV. The size difference is significant if several atoms (directly or transitively) depend on unstratified literals, as, e.g., in TIMETABLING.
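The difference in output size between the two strategies can be illustrated with a toy Python sketch. It is a deliberately simplified model, not a description of either system's code: a rule body is reduced to a list of domain extensions plus a set `derivable` of argument tuples for which a further (non-domain) body atom is known to be derivable, and the sketch compares only the number of generated instances, not the cost of producing them.

```python
from itertools import product

def nested_loop_instances(domains):
    """One ground instance per combination of domain values (nested-loop style)."""
    return list(product(*domains))

def join_instances(domains, derivable):
    """Only the combinations for which the non-domain body atom is derivable
    (the result a join with that atom's extension would produce)."""
    return [combo for combo in product(*domains) if combo in derivable]

nodes = ["a", "b", "c", "d"]
derivable = {("a", "b"), ("b", "c")}          # hypothetical extension of a derived atom
all_instances = nested_loop_instances([nodes, nodes])
useful_instances = join_instances([nodes, nodes], derivable)
```

With four domain values and two variables, the nested-loop enumeration produces 16 instances, while only 2 of them have a body that can ever be satisfied.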


Table  2.  The  new  instantiator  of  DLV  vs  the  instantiator  of  Smodels 


                               DLV                  Lparse
Program                   time       size       time        size
RAMSEY(3,6)≠17            8.50     13,344          -           -
HANOI[6discs,63steps]    14.58     62,413          -           -
CRISTAL                  13.37     20,978          -           -
K-DECOMP                 29.82    121,798       9.81     123,165
TIMETABLING             238.35    199,551      88.60   3,002,700
HAMILTONIAN-PATH          2.41     49,674       1.04      52,511
BLOCKSWORLD               2.99     46,872       9.73     459,706
CONSTRAINT-3COL          31.64          7      805.4           7

However, realistic applications often work on large domains or require several variables per rule. It follows that, for meaningful problems like, e.g., CRISTAL, HANOI and RAMSEY,² Lparse is not able to compute the instantiation in a

² HANOI and RAMSEY are the benchmark problems proposed at the AAAI Spring Symposium on ASP Programming, March 2001.

C. We implement the above techniques in the ASP system DLV and evaluate their efficiency on a number of benchmark problems taken from various domains. The results of the experiments are very positive and both techniques prove to be useful. Moreover, they are orthogonal and their integration performs at least as well as the best individual technique, resulting in a significant improvement of the performance of the DLV system.

In addition to the above contributions, we explain in detail the heuristic criterion adopted in DLV for the selection of the branching literal, and the way it is computed.

It is worth noting that techniques for reducing the number of look-aheads have been employed in SAT solvers and in other ASP systems. In particular, the ASP system Smodels makes a drastic pruning of the look-aheads by eliminating each literal which has been derived during a previous look-ahead at the same branch-point: for each literal $B \in DetCons(I \cup \{A\})$, the look-ahead for B is not performed, because B is guaranteed to be worse than A w.r.t. the heuristic function of Smodels. This technique eliminates a higher number of look-aheads than our technique described in Item A; but our technique is more general and is applicable to a wider class of heuristic functions. Indeed, the technique of Smodels relies on a monotonicity property of the heuristic: $DetCons(I \cup \{B\}) \subseteq DetCons(I \cup \{A\})$ implies that B is worse than A w.r.t. the heuristic function of Smodels. Our technique, instead, is applicable to every criterion determining the heuristic value from the result of the look-ahead (i.e., the heuristic value of A depends only on $DetCons(I \cup \{A\})$). In fact, our technique can also be applied in Smodels, while the optimization employed by Smodels cannot be used in DLV, since the heuristic employed in DLV is not monotonic in the sense described above. A 2-layered heuristic similar to the technique of Item B above has been successfully employed in the SAT solver SATZ [LA97].

2  Answer  Set  Programming  Language 

In  this  section,  we  provide  a  formal  definition  of  the  syntax  and  semantics  of 
the  ASP  language  supported  by  DLV:  disjunctive  datalog  extended  with  strong 
negation.  For  further  background,  see  [GL91,EFLP00]. 

ASP Programs A (disjunctive) rule r is a formula

$a_1 \vee \cdots \vee a_n \ \texttt{:-}\ b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m.$

where $a_1, \ldots, a_n, b_1, \ldots, b_m$ are classical literals (atoms possibly preceded by the
classical negation symbol $\neg$) and $n \ge 0$, $m \ge k \ge 0$. The disjunction $a_1 \vee \cdots \vee a_n$
is the head of r, while the conjunction $b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m$ is the
body, $b_1, \ldots, b_k$ the positive body, and $\mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m$ the negative body of
r. The sets of literals in head, body, positive body and negative body of r are
denoted by $H(r)$, $B(r)$, $B^+(r)$, and $B^-(r)$, respectively. Comparison operators
(like =, <, >, <>) are built-in predicates in ASP systems, and may appear in
the bodies of rules. A disjunctive datalog program (also called ASP program in
this paper) $\mathcal{P}$ is a finite set of rules.

As  usual,  an  object  (atom,  rule,  etc.)  is  called  ground  or  propositional,  if  it 
contains  no  variables. 


Answer Sets We describe the semantics of consistent answer sets, which was
originally defined in [GL91].

Given a program $\mathcal{P}$, let the Herbrand Universe $U_\mathcal{P}$ be the set of all constants
appearing in $\mathcal{P}$ and the Herbrand Base $B_\mathcal{P}$ be the set of all possible combinations
of predicate symbols appearing in $\mathcal{P}$ with constants of $U_\mathcal{P}$, possibly preceded
by $\neg$.

Given a rule r, $\mathit{Ground}(r)$ denotes the set of rules obtained by applying
all possible substitutions $\sigma$ from the variables in r to elements of $U_\mathcal{P}$.
Similarly, given a program $\mathcal{P}$, the ground instantiation $\mathit{Ground}(\mathcal{P})$ of $\mathcal{P}$ is the set
$\bigcup_{r \in \mathcal{P}} \mathit{Ground}(r)$.

For every program $\mathcal{P}$, we define its answer sets using its ground instantiation
$\mathit{Ground}(\mathcal{P})$ in two steps, following [Lif96]:

A set L of literals is said to be consistent if, for every literal $\ell \in L$, its
complementary literal is not contained in L. An interpretation I is a consistent
set of ground literals. An interpretation $I \subseteq B_\mathcal{P}$ is closed under $\mathcal{P}$ (where $\mathcal{P}$ is a
positive program), if, for every $r \in \mathit{Ground}(\mathcal{P})$, at least one literal in the head
is true whenever all literals in the body are true. I is an answer set for $\mathcal{P}$ if it is
minimal w.r.t. set inclusion and closed under $\mathcal{P}$.

The reduct or Gelfond-Lifschitz transform of a general ground program $\mathcal{P}$
w.r.t. an interpretation I is the positive ground program $\mathcal{P}^I$, obtained from $\mathcal{P}$
by (i) deleting all rules $r \in \mathcal{P}$ whose negative body is false w.r.t. I, (ii) deleting
the negative body from the remaining rules. An answer set of a general program
$\mathcal{P}$ is an interpretation I such that I is an answer set of $\mathit{Ground}(\mathcal{P})^I$.
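
The two-step definition can be illustrated by a small brute-force sketch in Python. It is not part of DLV; it assumes ground programs without strong negation, encodes a rule as a triple (head, positive body, negative body) of atom sets, and enumerates all candidate sets, so it is only usable on tiny examples. All function names are hypothetical.

```python
from itertools import combinations

# Assumed encoding: a ground rule is a triple (head, pos_body, neg_body)
# of frozensets of atom names; neg_body holds the atoms under "not".

def reduct(program, interp):
    """Gelfond-Lifschitz transform of a ground program w.r.t. interp."""
    return [(head, pos, frozenset())        # (ii) negative body removed
            for head, pos, neg in program
            if not (neg & interp)]          # (i) keep rules whose negative body is not false

def is_closed(positive_program, interp):
    """True if every rule whose body holds has at least one true head literal."""
    return all(head & interp
               for head, pos, _ in positive_program
               if pos <= interp)

def is_answer_set(program, interp):
    """interp must be a minimal closed set of the reduct w.r.t. interp."""
    interp = frozenset(interp)
    red = reduct(program, interp)
    if not is_closed(red, interp):
        return False
    atoms = sorted(interp)
    # Minimality: no proper subset of interp may be closed under the reduct.
    return not any(is_closed(red, frozenset(subset))
                   for size in range(len(atoms))
                   for subset in combinations(atoms, size))

def answer_sets(program):
    """Enumerate all answer sets by testing every set of atoms."""
    atoms = sorted(set().union(*(h | p | n for h, p, n in program)))
    return [frozenset(cand)
            for size in range(len(atoms) + 1)
            for cand in combinations(atoms, size)
            if is_answer_set(program, cand)]

# Example program:  a v b.   c :- not a.
EXAMPLE = [
    (frozenset({"a", "b"}), frozenset(), frozenset()),
    (frozenset({"c"}), frozenset(), frozenset({"a"})),
]
```

On the example program, the candidate {a, c} is rejected because its reduct contains only the first rule, under which the proper subset {a} is already closed.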

3  Answer  Set  Computation 

In this section, we describe the main steps of the computational process
performed by ASP systems. We will refer particularly to the computational engine
of the DLV system, but other ASP systems, such as Smodels, employ a very
similar procedure.

An answer set program $\mathcal{P}$ in general contains variables. The first step of a
computation of an ASP system eliminates these variables, generating a ground
instantiation of $\mathcal{P}$.^ The hard part of the computation is then performed on this
ground ASP program.

The  heart  of  the  computation  is  performed  by  the  Model  Generator,  which  is 
sketched  in  Figure  1.  Roughly,  the  Model  Generator  produces  some  “candidate” 

^ This ground instantiation is required to have precisely the same answer sets as $\mathcal{P}$,
and is usually much smaller than $\mathit{Ground}(\mathcal{P})$ [FLMP99].
reasonable time (we stopped the program after 20 minutes). In CONSTRAINT-3COL
and RAMSEY, variable domains are not large; however, some rules contain
a large number of domains, whose Cartesian product is big and slows down the
technique adopted by lparse.^

Moreover, in order to evaluate the quality of the ground programs produced by
the two instantiators, we are running a number of experiments with Smodels
on the ground programs produced by the DLV instantiator and by
lparse. Our preliminary results on the benchmark examples are very interesting.
For instance, lparse is faster than DLV in producing a ground program for
HAMILTONIAN-PATH. However, Smodels performs very badly with this program
as its input, while it is very fast on the ground program produced by the
DLV instantiator.

In conclusion, the experiments confirm that the database techniques that we
implemented in DLV are very useful. Further techniques and results from
the field of database optimization should be carried over to the area of knowledge
base systems to improve the efficiency of these systems.
Abstract. Most SAT solvers and Answer Set Programming (ASP) systems
employ a backtracking search by repeatedly assuming the truth of
literals. The choice of these branching literals is crucial for the
performance of these systems.

Competitive ASP systems employ advanced heuristics to select branching
literals, which are usually based on “look-ahead” techniques: To evaluate
the heuristic value of a literal L, truth and falsity of L are assumed in the
current interpretation, consequences are derived, and the quality of the
resulting interpretations is evaluated. This process can be very expensive,
and often consumes most of the time taken by an ASP system.

In this paper, we present two techniques to optimize the computation
of the heuristics in the ASP system DLV. The first technique singles out
pairs of literals (A, not B) having precisely the same consequences, which
allows for making only one look-ahead for each of these pairs. The second
technique (inspired by SAT solvers) is a 2-layered heuristic, in which a
simple heuristic criterion reduces the set of literals to be looked-ahead.
We implement both techniques in the ASP system DLV and evaluate
their efficiency on a number of benchmark problems taken from various
domains. The experiments confirm the usefulness of both techniques,
which significantly improve the performance of DLV.


1  Introduction 

DLV  is  a  knowledge  representation  system  based  on  disjunctive  logic  programming 
(DLP)  [Min82,GL91]  offering  front-ends  to  several  Knowledge  Representation 
(KR)  formalisms  [ELM+98b,ELM+98a,EFLP99].  A  strong  point  of  DLV  is  its 
highly  expressive  language,  which  allows  elegant  and  natural  representations 
of very hard problems (up to $\Sigma_2^P$-hard problems). DLV supports a declarative
mation (D2I)”.


T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 295-308, 2001.
© Springer-Verlag Berlin Heidelberg 2001


296  Wolfgang  Faber  et  al. 


programming style which has recently been termed Answer Set Programming
(ASP); hence it is referred to as an ASP system. The idea of ASP is to represent
a given computational problem by a logic program whose answer sets correspond
to solutions, and then use an answer set solver (like DLV) to compute them [Lif99].

An  efficient  support  for  the  highly  expressive  language  of  DLV  requires  the 
use  of  smart  algorithms  and  data  structures  as  well  as  sophisticated  optimization 
techniques  in  order  to  deal  with  such  hard  computational  tasks. 

DLV employs backtracking search by repeatedly assuming the truth of literals
[FLP99], and in order to improve the efficiency of the DLV system, in
a previous paper [FLP01] we have experimented with a number of heuristics
for deciding which branching literal to assume. These heuristics are based on
“look-ahead” techniques: to evaluate the heuristic value of a literal L w.r.t. the
interpretation I at hand, truth and falsity of L are assumed in the current
interpretation, and its consequences are derived by computing its deterministic
extensions $I^+ = \mathit{DetCons}(I \cup \{L\})$ (the interpretation $I^+$ is guaranteed
to be contained precisely in the same answer sets containing $I \cup \{L\}$) and
$I^- = \mathit{DetCons}(I \cup \{\mathit{not}\ L\})$. Note that $I^+$ and $I^-$ can be inconsistent, in which
case the search space can be pruned early.

The heuristic value of L is a measure of the “quality” of the resulting
interpretations $I^+$ and $I^-$. Some of these heuristics proved to be very useful, as they
drastically reduce the number of choice-points arising in an ASP computation.
However, the computation of these heuristics is very expensive, since the number
of literals to be “looked-ahead” is very large in some cases, and the cost of
a look-ahead is linear in the size of the Herbrand Base in the worst case. The
computation of the heuristics thus often consumes most of the total time taken
by an ASP system, and may slow down the ASP system significantly.

In  this  paper,  we  try  to  reduce  the  amount  of  time  needed  to  evaluate  the 
heuristics,  by  reducing  the  number  of  look-aheads  that  need  to  be  performed. 
The  main  contributions  of  the  paper  are  the  following: 

A. We define a new condition which is sufficient to guarantee that, at a given
stage of the computation, two literals (A, not B) have precisely the same
set of deterministic consequences w.r.t. the interpretation I at hand, that is,
$\mathit{DetCons}(I \cup \{A\}) = \mathit{DetCons}(I \cup \{\mathit{not}\ B\})$. Consequently, A and not B
are guaranteed to have precisely the same heuristic values, and we avoid the
look-ahead for one of them.

This technique allows us to save 50% of the look-aheads in several cases
including, e.g., Hamiltonian Path and 3SAT programs.

B.  We  design  a  2-layered  heuristic.  A  computationally  cheap  heuristic  criterion 
reduces  the  set  of  literals  to  be  considered,  and  the  look-ahead  to  select 
the  branching  literal  is  applied  only  to  the  literals  in  this  set.  This  method 
significantly  reduces  the  number  of  look-aheads,  but,  unlike  the  previous 
technique,  it  is  not  an  “exact”  method,  that  is,  it  might  exclude  literals 
which  would  otherwise  have  had  high  heuristic  values.  Also  some  literals  for 
which  the  look-ahead  detects  inconsistency  can  be  missed  in  this  way,  so 
there  will  be  less  pruning  in  general. 
Forward Function DetCons(I: Interpretation): Interpretation;
(* Extends I with the literals that can be deterministically inferred and returns the
   resulting interpretation or the set of all literals ℒ upon inconsistency. [FLP99] *)

Forward Procedure Select(var I: Interpretation, var L: ClassicalLiteral);
(* Selects the classical literal L having the highest heuristic value (see Section 4) *)

Function ModelGenerator(var I: Interpretation): Boolean;
(* The function returns True iff I can be extended to an answer set. *)
var inconsistency: Boolean;
begin
    I := DetCons(I);
    if I = ℒ then return False;  (* inconsistency detected *)
    if no literal is undefined in I then return IsAnswerSet(I);
    Select(I, L);
    if ModelGenerator(I ∪ {L}) then return True;
    else return ModelGenerator(I ∪ {not L});
end;

Fig. 1. Computation of Answer Sets


answer sets. The stability of each of these is subsequently verified by the function
IsAnswerSet(I), which checks whether the given “candidate” I is a minimal
model of the program $\mathit{Ground}(\mathcal{P})^I$, the reduct of $\mathit{Ground}(\mathcal{P})$ w.r.t. I.

The ModelGenerator function is first invoked with parameter I set to the
empty interpretation.^ If the program $\mathcal{P}$ has an answer set, the function returns
True, setting I to that answer set; otherwise it returns False. The Model Generator
is similar to the Davis-Putnam procedure employed by SAT solvers. It
first calls the function DetCons, which extends I with those literals that can
be deterministically inferred from I. DetCons is similar to a unit propagation
procedure employed by SAT solvers, but exploits the peculiarities of ASP for
making further inferences (e.g., it exploits the knowledge that every answer set
is a minimal model). If DetCons does not detect any inconsistency, a classical
literal L is selected according to a heuristic criterion by a call to the Select
procedure. ModelGenerator is then recursively called on $I \cup \{L\}$; if this call does
not generate an answer set (i.e., $I \cup \{L\}$ is not contained in any answer set), it is
called on $I \cup \{\mathit{not}\ L\}$. The classical literal L plays the role of a branching variable
of a SAT solver. And indeed the selection of a “good” literal L is crucial for the
performance of an ASP system. In the next section, we describe the heuristic
criterion adopted by DLV for the selection of such branching literals, and how
Select is implemented.
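
The control flow of the Model Generator can be mimicked by the following Python sketch. It is a simplification under several assumptions and not DLV's implementation: programs are ground and free of strong negation, rules are triples (head, positive body, negative body) of atom sets, `det_cons` performs only forward inference and violation detection (DLV's DetCons infers more), the heuristic Select is replaced by picking the first undefined atom, and the function returns the answer set found (or None) instead of a Boolean. All names are hypothetical.

```python
from itertools import combinations

def det_cons(program, true_atoms, false_atoms):
    """Very weak stand-in for DetCons: forward inference only.
    Returns the extended (true, false) pair, or None upon inconsistency."""
    true_atoms, false_atoms = set(true_atoms), set(false_atoms)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in program:
            if not (pos <= true_atoms and neg <= false_atoms):
                continue                      # body not (yet) true
            open_head = head - false_atoms    # head atoms that are not false
            if not open_head:
                return None                   # true body, false head: violated
            if len(open_head) == 1 and not open_head <= true_atoms:
                true_atoms |= open_head       # the last head atom must be true
                changed = True
    return true_atoms, false_atoms

def is_answer_set(program, interp):
    """Checks that interp is a minimal model of the reduct w.r.t. interp."""
    red = [(h, p) for h, p, n in program if not (n & interp)]
    def closed(atoms):
        return all(h & atoms for h, p in red if p <= atoms)
    if not closed(interp):
        return False
    ordered = sorted(interp)
    return not any(closed(set(sub))
                   for size in range(len(ordered))
                   for sub in combinations(ordered, size))

def model_generator(program, atoms, true_atoms=frozenset(), false_atoms=frozenset()):
    """Returns an answer set extending the given interpretation, or None."""
    result = det_cons(program, true_atoms, false_atoms)
    if result is None:
        return None                           # inconsistency detected
    true_atoms, false_atoms = result
    undefined = sorted(atoms - true_atoms - false_atoms)
    if not undefined:
        return true_atoms if is_answer_set(program, true_atoms) else None
    lit = undefined[0]                        # placeholder for the heuristic Select
    found = model_generator(program, atoms, true_atoms | {lit}, false_atoms)
    if found is not None:
        return found
    return model_generator(program, atoms, true_atoms, false_atoms | {lit})

# Example program:  a v b.   c :- not a.
EXAMPLE = [
    (frozenset({"a", "b"}), frozenset(), frozenset()),
    (frozenset({"c"}), frozenset(), frozenset({"a"})),
]
```

Because the propagation in this sketch is so weak, the final stability check rejects several total candidates (for instance {a, b, c}) before {a} is returned; stronger inferences in DetCons and a good branching heuristic are precisely what avoid such wasted branches.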

^ Observe that the interpretations built during the computation are 3-valued and an
interpretation I is a set of ground literals. A ground classical literal A is True (resp.
False) w.r.t. I if $A \in I$ (resp. $\mathit{not}\ A \in I$); otherwise A is Undefined w.r.t. I.
4 Evaluation of the Heuristic Function

In  this  section,  we  define  the  heuristic  criterion  adopted  in  the  DLV  system  and 
we  describe  how  it  is  evaluated. 

The  heuristics  of  DLV  is  a  “dynamic  heuristics”  (the  ASP  equivalent  of  UP 
heuristics  for  SAT),  that  is,  the  heuristic  value  of  a  literal  Q  depends  on  the  result 
of  taking  Q  true  as  well  as  false  and  computing  its  consequences,  respectively.  In 
order  to  reduce  the  number  of  look-aheads,  the  DLV  system  does  not  evaluate  the 
heuristic  value  of  all  undefined  classical  literals;  rather,  it  considers  only  a  subset 
of  the  undefined  classical  literals,  called  possibly-true  literals.  The  correctness  of 
this  strategy,  adopted  since  the  first  release  of  DLV,  is  shown  in  [LRS97]. 

Definition 1. A Possibly-True (PT) literal of $\mathcal{P}$ w.r.t. an interpretation I is
an undefined classical literal p such that there exists a rule $r \in \mathit{Ground}(\mathcal{P})$ for
which all of the following conditions hold:

1. p is in the head of r: $p \in H(r)$;

2. the head of r is not true w.r.t. I: $H(r) \cap I = \emptyset$;

3. the positive body of r is true w.r.t. I: $B^+(r) \subseteq I$;

4. the negative body of r is not false w.r.t. I: $I \cap \{a : \mathit{not}\ a \in B^-(r)\} = \emptyset$.

The set of all PT literals of $\mathcal{P}$ w.r.t. I is denoted by $PT_\mathcal{P}(I)$. □

Example 1. Consider the program $\mathcal{P}$ = {a ∨ b :- c.  d :- not a, not f.  e ∨ f :- k.}
and let I = {c} be an interpretation for $\mathcal{P}$, then $PT_\mathcal{P}(I)$ = {a, b, d}.
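
Definition 1 translates almost literally into code. The following Python sketch is an illustration only: it assumes ground rules without strong negation, encoded as triples (head, positive body, negative body) of atom sets, and a 3-valued interpretation given as two sets of true and false atoms. The names are hypothetical.

```python
def pt_literals(program, true_atoms, false_atoms):
    """Possibly-True literals of a ground program w.r.t. a 3-valued interpretation."""
    pt = set()
    for head, pos, neg in program:
        if head & true_atoms:          # condition 2: the head must not be true
            continue
        if not pos <= true_atoms:      # condition 3: the positive body must be true
            continue
        if neg & true_atoms:           # condition 4: the negative body must not be false
            continue
        # condition 1: undefined head atoms (none is true here, so exclude the false ones)
        pt |= {p for p in head if p not in false_atoms}
    return pt

# The program of Example 1:  a v b :- c.   d :- not a, not f.   e v f :- k.
EXAMPLE_1 = [
    (frozenset({"a", "b"}), frozenset({"c"}), frozenset()),
    (frozenset({"d"}), frozenset(), frozenset({"a", "f"})),
    (frozenset({"e", "f"}), frozenset({"k"}), frozenset()),
]
```

With I = {c}, the third rule contributes nothing because its positive body atom k is not true, which is why e and f are not PT literals.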

As  shown  in  Figure  2  (initial  foreach  statement),  DLV’s  heuristic  function  is 
evaluated  only  on  the  PT  literals.  It  is  worthwhile  noting,  however,  that  the  PT 
literals  do  not  always  restrict  the  set  of  classical  literals  to  be  looked-ahead,  since 
all  undefined  literals  are  PT  in  some  cases.  For  instance,  in  the  program  encoding 
3SAT  (see  Section  7.1)  every  undefined  literal  is  a  PT  literal,  as  it  occurs  in 
the  head  of  a  rule  having  a  true  (empty)  body.  In  contrast,  in  the  program 
HAMPATH, at a given stage of the computation, the PTs are only those literals
of the form inPath(a, b) or outPath(a, b), where a is a node already reached
from the start (reached(a) is True) and (a, b) is an arc of the input graph.

Let  us  now  turn  our  attention  to  the  heuristic  criterion  adopted  in  DLV  to 
choose  the  “best”  among  the  PT  literals. 

A peculiar property of answer sets is supportedness: For each true classical
literal A in an answer set I, there exists a rule r of the program such that the
body of r is true w.r.t. I and A is the only true literal in the head of r (r is then
called a supporting rule for A). Since an ASP system must eventually converge
to a supported interpretation, ASP systems try to keep the interpretations “as
much supported as possible” during the intermediate steps of the computation.
To this end, the DLV system counts the number of UnsupportedTrue (UT) literals,
i.e., classical literals which are true in the current interpretation but still miss
a supporting rule (in [FLP99] UTs, called MBTs there, are discussed in detail).
For instance, the rule :- not x implies that x must be true in each answer set of
Procedure Select(var I: Interpretation, var L: ClassicalLiteral);
var I_A^+, I_A^-: Interpretation;
begin
    L := NULL;
    foreach A ∈ PT_P(I) do
        I_A^+ := DetCons(I ∪ {A});  (* look-ahead for A *)
        if I_A^+ = ℒ then I := I ∪ {not A};
        else I_A^- := DetCons(I ∪ {not A});  (* look-ahead for not A *)
            if I_A^- = ℒ then I := I ∪ {A}; endif
        endif
        if I_A^+ ≠ ℒ and I_A^- ≠ ℒ then  (* no inconsistency has arisen *)
            if L = NULL then L := A;  (* first literal, no comparison *)
            (* compare A against L w.r.t. the heuristic; *)
            elseif ( UT(I_A^+) + UT(I_A^-) ) < ( UT(I_L^+) + UT(I_L^-) ) then L := A;
            elseif ( UT2(I_A^+) + UT2(I_A^-) ) < ( UT2(I_L^+) + UT2(I_L^-) ) then L := A;
            elseif ( UT3(I_A^+) + UT3(I_A^-) ) < ( UT3(I_L^+) + UT3(I_L^-) ) then L := A;
            elseif ( US(I_A^+) + US(I_A^-) ) < ( US(I_L^+) + US(I_L^-) ) then L := A;
    endfor
end;

Fig. 2. Selection of the Branching Literal by DLV’s Heuristic


the program, but it does not give a “support” for x. Thus, in the DLV system x
is assumed true in the current interpretation to satisfy that rule, and it is added
to the set of UnsupportedTrue literals; it will be removed from this set once a
supporting rule for x is found (e.g., x ∨ b :- c is a supporting rule for x in
the interpretation I = {x, not b, c}).

Intuitively,  since  the  set  of  UnsupportedTrue  literals  must  eventually  be 
empty  when  an  answer  set  is  reached,  the  heuristic  of  DLV  tries  to  minimize 
the  number  of  UT  literals,  taking  particular  care  of  those  UT  literals  which  are 
more  “in  danger”  (an  UT  literal  appearing  in  the  head  of  fewer  rules  is  more  in 
danger  than  a  literal  appearing  in  the  head  of  many  rules). 

Given an interpretation I, let $UT(I)$ be the number of UT literals in I. Moreover,
let $UT_2(I)$ and $UT_3(I)$ be the number of UT literals occurring, respectively,
in the heads of 2 and 3 rules (which are not already satisfied w.r.t. I, and can
therefore be potentially used to support the UT literal).^ The heuristic of DLV
considers $UT(I)$, $UT_2(I)$ and $UT_3(I)$ in a prioritized way to favor literals yielding
interpretations with fewer $UT$/$UT_2$/$UT_3$ literals (which should more likely
lead to a supported model). If all UT counters are equal, then the heuristic
minimizes $US(I)$, the number of unsatisfied rules w.r.t. I.

Since the failure of the computation branch selecting A True starts a new
branch assuming not A (see last instruction in Figure 1), the heuristic criterion
considers the effect of choosing a literal A and its complement not A in a
balanced way. To this end, the counters $UT(I_A^+)$, $UT_2(I_A^+)$, $UT_3(I_A^+)$, and $US(I_A^+)$,


^ $UT_1$ literals do not exist in DLV computations. Whenever a rule r is the last
potentially supporting rule for a UT literal A, then A is inferred via r (see [FLP99]).
resulting from the look-ahead on $I \cup \{A\}$, are added, respectively, to the
counters $UT(I_A^-)$, $UT_2(I_A^-)$, $UT_3(I_A^-)$, and $US(I_A^-)$, resulting from the look-ahead on
$I \cup \{\mathit{not}\ A\}$, when evaluating the heuristics.
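
The prioritized comparison can be sketched in Python as a lexicographic order on the summed counters. This reading is an assumption: the text states that the counters are considered in a prioritized way, and the sketch takes that to mean that a later counter is consulted only when all earlier ones are equal. The `Counters` type and the function names are hypothetical, and the counter values are supplied by hand rather than computed by look-ahead.

```python
from typing import NamedTuple

class Counters(NamedTuple):
    """Counters of one look-ahead result: UT, UT2, UT3 and US."""
    ut: int
    ut2: int
    ut3: int
    us: int

def heuristic_key(plus, minus):
    """Adds the counters of the look-aheads on A and on not A component-wise."""
    return tuple(p + m for p, m in zip(plus, minus))

def select(lookaheads):
    """lookaheads maps each candidate literal to its (plus, minus) counters.
    Returns the literal with the lexicographically smallest summed counters."""
    best = None
    for literal, (plus, minus) in lookaheads.items():
        if best is None or heuristic_key(plus, minus) < heuristic_key(*lookaheads[best]):
            best = literal            # strictly better than the current choice
    return best

# Hand-made counters for three candidate literals (illustrative values only).
CANDIDATES = {
    "a": (Counters(1, 0, 0, 3), Counters(1, 1, 0, 2)),   # sums: (2, 1, 0, 5)
    "b": (Counters(0, 0, 0, 5), Counters(2, 0, 0, 4)),   # sums: (2, 0, 0, 9)
    "c": (Counters(1, 0, 0, 2), Counters(1, 0, 0, 9)),   # sums: (2, 0, 0, 11)
}
```

All three candidates tie on the UT sum; b beats a on the UT2 sum, and c loses to b only on the number of unsatisfied rules.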

5  Look-Ahead  Equivalences 

Dynamic heuristics depend only on the interpretations resulting from the look-aheads,
$\mathit{DetCons}(I \cup \{L\})$ (resp. $\mathit{DetCons}(I \cup \{\mathit{not}\ L\})$). It is therefore interesting to identify cases
where two literals L and L' are look-ahead equivalent, i.e., $\mathit{DetCons}(I \cup \{L\}) =
\mathit{DetCons}(I \cup \{L'\})$, since one of the two look-ahead computations could be saved.
This notion of equivalence is formalized next.

Definition 2. Let p and q be two undefined literals w.r.t. an interpretation I.
p and q are look-ahead equivalent if $\mathit{DetCons}(I \cup \{p\}) = \mathit{DetCons}(I \cup \{q\})$.

To  single  out  a  sufficient  and  efficiently  checkable  condition  which  guarantees 
such  an  equivalence,  we  first  define  the  notion  of  a  potentially  supporting  rule: 

Definition 3. Given a program $\mathcal{P}$, a classical literal a, and a (3-valued)
interpretation I, a rule $r \in \mathcal{P}$ is a potentially supporting rule for a w.r.t. I, if the
following conditions are satisfied: (i) a occurs in the head of r, (ii) no literal in
$H(r) - \{a\}$ is true w.r.t. I, and (iii) no literal in the body of r is false w.r.t. I.
Let $\mathit{psupp}_\mathcal{P}(a, I)$ denote the number of potentially supporting rules for a.

We  can  now  formulate  the  following: 

Proposition 1. If two undefined classical literals a and b occur in the head of
a rule r in a program $\mathcal{P}$, and a and b are the only undefined literals w.r.t. an
interpretation I in r (where we assume that there is no multiple occurrence of
classical literals in rules), then it holds that:

1. If $\mathit{psupp}_\mathcal{P}(b, I) = 1$, then a and not b are look-ahead equivalent.

2. If $\mathit{psupp}_\mathcal{P}(a, I) = 1$, then not a and b are look-ahead equivalent.

Proof. (Sketch) Suppose $\mathit{psupp}_\mathcal{P}(b, I) = 1$. Then r is the only rule in $\mathcal{P}$
which might derive b. Since the body of r is already true in I, such a derivation
is performed iff a becomes false. Therefore, $\mathit{DetCons}(I)$ either contains both a
and not b or it contains none of them. A symmetric argument shows item 2. □

Example 2. Consider the program {a ∨ b.} and $I = \emptyset$. Both a and b are PT
literals, so look-ahead for a, not a, b, and not b is performed, i.e. we compute
$\mathit{DetCons}(\{a\}) = \{a, \mathit{not}\ b\}$, $\mathit{DetCons}(\{\mathit{not}\ a\}) = \{\mathit{not}\ a, b\}$, $\mathit{DetCons}(\{b\}) =
\{\mathit{not}\ a, b\}$, and $\mathit{DetCons}(\{\mathit{not}\ b\}) = \{a, \mathit{not}\ b\}$. In this example we can save the
look-aheads for not b and b because of Proposition 1, and thus save half of the
look-aheads.

In  DLV  computations,  we  can  recognize  the  applicability  of  Proposition  1  very 
efficiently  and  avoid  extraneous  look-aheads.  Experimental  results  reported  in 
Section  8  will  show  that  we  avoid  up  to  50%  of  look-aheads  in  some  cases  (e.g. 
on  3SAT)  by  exploiting  this  simple  condition. 
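
The condition of Proposition 1 is cheap to test. The Python sketch below is one possible way to do so, not DLV's actual code; it assumes ground rules without strong negation, encoded as triples (head, positive body, negative body) of atom sets, and a 3-valued interpretation given as sets of true and false atoms. In line with the proof sketch, it additionally requires the rule r itself to be potentially supporting, which the proposition leaves implicit. All names are hypothetical.

```python
def psupp(program, atom, true_atoms, false_atoms):
    """Number of potentially supporting rules for atom (Definition 3)."""
    return sum(1 for head, pos, neg in program
               if atom in head                          # (i) atom occurs in the head
               and not ((head - {atom}) & true_atoms)   # (ii) no other head literal is true
               and not (pos & false_atoms)              # (iii) no positive body literal is false
               and not (neg & true_atoms))              # (iii) no "not x" with x true

def equivalent_pairs(program, true_atoms, false_atoms):
    """Pairs of look-ahead equivalent literals detected via Proposition 1."""
    defined = true_atoms | false_atoms
    pairs = set()
    for head, pos, neg in program:
        undefined = (head | pos | neg) - defined
        # a and b must be the only undefined literals of r, both in its head
        if len(undefined) != 2 or not undefined <= head:
            continue
        # assumption taken from the proof sketch: r is itself potentially supporting
        if (head & true_atoms) or (pos & false_atoms) or (neg & true_atoms):
            continue
        a, b = sorted(undefined)
        if psupp(program, b, true_atoms, false_atoms) == 1:
            pairs.add((a, "not " + b))      # item 1
        if psupp(program, a, true_atoms, false_atoms) == 1:
            pairs.add(("not " + a, b))      # item 2
    return pairs

# The program of Example 2:  a v b.
EXAMPLE_2 = [(frozenset({"a", "b"}), frozenset(), frozenset())]
```

For each reported pair only one of the two look-aheads needs to be computed; on Example 2 this leaves two look-aheads instead of four.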
6 2-Layered Heuristic

In [LA97] a different idea for reducing look-aheads is presented: An easy-to-compute
heuristic is defined as a first layer, and look-ahead is only computed
on those possible choices which look promising w.r.t. this easier heuristic.
This gives a kind of 2-layered heuristic.

The  simple  heuristic  criteria  defined  in  [LA97]  involve  the  number  of  binary 
clauses  a  classical  literal  occurs  in.  The  rationale  is  that  this  is  the  number  of 
immediate  propagations  that  can  be  performed  during  the  look-ahead.  This  idea 
can  be  directly  transferred  to  our  ASP  framework: 

Definition  4.  A  binary  clause  is  a  rule  which  contains  exactly  two  undefined 
classical  literals  w.r.t.  an  interpretation  I.  The  number  of  binary  occurrences  of 
an  undefined  literal  a  is  the  number  of  binary  clauses  a  occurs  in. 

Note  that  this  notion  directly  corresponds  to  the  number  of  immediate  prop¬ 
agations  which  can  be  performed  by  assuming  a  and  not  a,  so  it  matches  the 
intuition  of  [LA97].  To  reduce  the  number  of  literals  to  be  looked-ahead,  we 
adopt  the  following  criterion: 

First-Layer Heuristic $S_{bin}$. Let $PT_\mathcal{P}(I)$ be the set of PT literals of a program $\mathcal{P}$
w.r.t. an interpretation I, and let $S_{bin} \subseteq PT_\mathcal{P}(I)$ be the set of PT literals having
more than the average number of binary occurrences w.r.t. all literals in $PT_\mathcal{P}(I)$.
Then, consider only the literals in $S_{bin}$ for the selection of the branching literals
(i.e., make look-ahead only on these literals).

Note  that  our  first-layer  heuristics  is  inspired  by  the  same  intuition  as  the 
first-layer  heuristics  in  [LA97],  even  though  it  is  not  precisely  the  same. 
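
A minimal Python sketch of the first-layer filter follows. It applies Definition 4 literally and makes several assumptions: rules are ground, free of strong negation and encoded as triples (head, positive body, negative body) of atom sets; the interpretation is given as sets of true and false atoms; and the set of PT literals is passed in by the caller. The names are hypothetical. Note that with the strict "more than the average" criterion the result is empty whenever all PT literals have the same number of binary occurrences; how that case is treated is not covered here.

```python
def binary_occurrences(program, true_atoms, false_atoms):
    """Maps each undefined atom to the number of binary clauses it occurs in,
    where a binary clause is a rule with exactly two undefined literals."""
    defined = true_atoms | false_atoms
    counts = {}
    for head, pos, neg in program:
        undefined = (head | pos | neg) - defined
        if len(undefined) == 2:
            for atom in undefined:
                counts[atom] = counts.get(atom, 0) + 1
    return counts

def first_layer(pt, counts):
    """PT literals with more than the average number of binary occurrences."""
    if not pt:
        return set()
    average = sum(counts.get(p, 0) for p in pt) / len(pt)
    return {p for p in pt if counts.get(p, 0) > average}

# Illustrative program:  a v b.   a v c.   d v e v f.
EXAMPLE = [
    (frozenset({"a", "b"}), frozenset(), frozenset()),
    (frozenset({"a", "c"}), frozenset(), frozenset()),
    (frozenset({"d", "e", "f"}), frozenset(), frozenset()),
]
```

Only the literals returned by `first_layer` would then be passed on to the expensive look-ahead.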

7  Benchmarks 

7.1  Benchmark  Programs 

To evaluate the impact of the two optimization techniques presented in the
previous sections, we chose four benchmark problems: 3SAT, Blocksworld
Planning, Hamiltonian Path, and Strategic Companies.

3SAT is one of the best researched problems in AI and generally used for solving
many other problems by translating them to 3SAT, solving the 3SAT problem,
and transforming the solution back to the original domain:

Let $\Phi$ be a propositional formula in conjunctive normal form (CNF)
$\Phi = \bigwedge_{i=1}^{n} (d_{i,1} \vee \ldots \vee d_{i,3})$ where the $d_{i,j}$ are literals over the propositional
variables $x_1, \ldots, x_m$. $\Phi$ is satisfiable, iff there exists a consistent conjunction I
of literals such that $I \models \Phi$.

3SAT is a classical NP-complete problem and can be easily represented in
our formalism as follows: For each propositional variable $x_i$ ($1 \le i \le m$), we add
the following rule which ensures that we either assume that variable $x_i$ or its
complement $nx_i$ true: $x_i \vee nx_i.$ For each clause $d_1 \vee \ldots \vee d_3$ in $\Phi$ we add
the constraint :- not $\bar{d}_1$, ..., not $\bar{d}_3$. where $\bar{d}_i$ ($1 \le i \le 3$) is $x_j$ if $d_i$ is a
positive literal $x_j$, and $nx_j$ if $d_i$ is a negative literal $\neg x_j$.
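
The translation is mechanical, as the following Python sketch illustrates. It is not the generator used for the benchmarks; it assumes clauses given as tuples of nonzero integers (positive for $x_j$, negative for $\neg x_j$) and emits rules in a DLV-style concrete syntax, with `v` for disjunction and `:-` for the rule arrow. The function name is hypothetical.

```python
def encode_3sat(num_vars, clauses):
    """Returns the ASP rules encoding a CNF formula, one rule per string.
    clauses: iterable of tuples of nonzero ints, e.g. (1, -2, 3) for x1 v -x2 v x3."""
    # Guess: each variable x_i or its complement nx_i is assumed true.
    rules = [f"x{i} v nx{i}." for i in range(1, num_vars + 1)]
    for clause in clauses:
        # Check: a clause is violated iff none of its literals is assumed true.
        body = ", ".join(
            f"not {'x' if lit > 0 else 'nx'}{abs(lit)}" for lit in clause
        )
        rules.append(f":- {body}.")
    return rules
```

For the single clause $x_1 \vee \neg x_2 \vee x_3$ over three variables, the sketch produces the three guessing rules followed by the constraint `:- not x1, not nx2, not x3.`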
Hamiltonian Path (HAMPATH) is another classical NP-complete problem from
the area of graph theory:

Given an undirected graph G = (V, E), where V is the set of vertices of G
and E is the set of edges, and a node $a \in V$ of this graph, does there exist a
path of G starting at a and passing through each node in V exactly once?

Suppose that the graph G is specified by using two predicates node(X) and
arc(X, Y)^, and the starting node is specified by the unary predicate start which
contains only a single tuple. Then, the following program solves the problem
HAMPATH.

% Each node has to be reached.
:- node(X), not reached(X).
reached(X) :- start(X).
reached(X) :- inPath(Y, X).

% Guess whether to take a path or not.
inPath(X, Y) v outPath(X, Y) :- reached(X), arc(X, Y).

% At most one incoming/outgoing arc!
:- inPath(X, Y), inPath(X, Z), Y <> Z.
:- inPath(X, Y), inPath(Z, Y), X <> Z.

Blocksworld (BW) is a classic problem from the planning domain, and one of
the oldest problems in AI.

Given  a  table  and  a  number  of  blocks  in  a  (known)  initial  state  and  a  desired 
goal  state,  try  to  reach  that  goal  state  by  moving  one  block  at  a  time,  such 
that  only  unoccupied  blocks  are  moved  on  top  of  other  unoccupied  blocks 
or  the  table. 

Figure 3 shows a simple example that can be solved in three time steps: First
we move block c to the table, then block b on top of a, and finally c on top of b.
Due to space restrictions we refer to [Erd99,FLMP99] for complete encodings.


Fig. 3. Simple BW Example (panels: initial, goal)


Strategic Companies (STRATCOMP), finally, is a $\Sigma_2^P$-complete problem, which
was first described in [CEG97]:

A holding owns companies C(1), ..., C(c), each of which produces some
goods. Some of these companies may jointly control another one. This is
modeled by means of predicates prod(P, C1, C2) (product P is produced
by companies C1 and C2) and contr(C, C1, C2, C3) (company C is
jointly controlled by C1, C2 and C3).

Now, some companies should be sold, under the constraint that all goods can
still be produced, and that no company is sold which would still be controlled
by the holding afterwards. A company is strategic if it belongs to a strategic
set, which is a minimal set of companies satisfying these constraints.

Footnote (HAMPATH encoding): predicate arc is symmetric, since undirected arcs
are bidirectional.

The  answer  sets  of  the  following  natural  program  correspond  one  to  one  to 
the  strategic  sets.  Checking  whether  any  given  company  C  is  strategic  is  done 
by  brave  reasoning:  “Is  there  any  answer  set  containing  C?” 

strategic(C1) v strategic(C2) :- prod(P, C1, C2).

strategic(C) :- contr(C, C1, C2, C3), strategic(C1), strategic(C2), strategic(C3).

As  in  [CEG97]  we  assume  that  each  product  is  produced  by  at  most  two 
companies  and  each  company  is  jointly  controlled  by  at  most  three  companies. 
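The definition of a strategic set can be checked by brute force on very small instances. The sketch below enumerates all subsets of companies, keeps those that still produce every good and are closed under the control relation, and returns the inclusion-minimal ones. The function name `strategic_sets` and the tuple-based input format are assumptions for illustration; this is not how DLV evaluates the program.

```python
from itertools import combinations

def strategic_sets(companies, prod, contr):
    """prod: (product, c1, c2) triples; contr: (c, c1, c2, c3) tuples."""
    def satisfies_constraints(s):
        # Every product must still be produced by some company in s.
        produces_all = all(c1 in s or c2 in s for _, c1, c2 in prod)
        # If all controlling companies are in s, the controlled one must be too.
        closed = all(c in s or not {c1, c2, c3} <= s for c, c1, c2, c3 in contr)
        return produces_all and closed

    feasible = [
        frozenset(combo)
        for size in range(len(companies) + 1)
        for combo in combinations(companies, size)
        if satisfies_constraints(frozenset(combo))
    ]
    # Keep only the sets with no feasible proper subset.
    return [s for s in feasible if not any(t < s for t in feasible)]
```

A company is then strategic exactly when it occurs in at least one of the returned sets, which mirrors the brave reasoning query described above.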

7.2  Benchmark  Data 

For  3SAT,  we  have  randomly  generated  3CNF  formulas  over  n  variables  (where  n 
denotes  the  size  as  plotted  on  the  x-axis  of  the  graphs  in  Section  8)  using  a  tool 
by  Selman  and  Kautz  [SK97].  For  each  size  we  generated  8  instances,  where  we 
kept  the  ratio  between  the  number  of  clauses  and  the  number  of  variables  near 
the  cross-over  point  of  4.3. 
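The instance distribution described above can be illustrated with a small generator for the fixed-clause-length model. This is only a sketch and not the Selman and Kautz tool; the function name `random_3cnf`, the `seed` parameter, and the signed-integer literal encoding are assumptions.

```python
import random

def random_3cnf(n_vars, ratio=4.3, seed=None):
    """Return a random 3CNF formula as a list of clauses of signed integers."""
    rng = random.Random(seed)
    n_clauses = round(ratio * n_vars)  # keep clauses/variables near the ratio
    formula = []
    for _ in range(n_clauses):
        # Three distinct variables per clause, each negated with probability 1/2.
        variables = rng.sample(range(1, n_vars + 1), 3)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula
```

For example, `random_3cnf(20, seed=0)` yields 86 clauses over 20 variables; fixing the seed makes a benchmark instance reproducible.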

The instances for HAMPATH were generated by a tool by Patrik Simons
which has been used to compare Smodels against SAT solvers (cf. [Sim00]).
Again, for each problem size n we generated 8 instances, always assuming node
1 as the starting node.

The blocksworld problems P3 and P4 have been employed in [Erd99] to
compare ASP systems, and can be solved in 8 and 9 steps, respectively. We
augmented these by problem P5, which requires 11 steps. For each of these
problems, we generated 8 random permutations of the input.

For STRATCOMP, finally, we randomly generated 8 instances for each problem
size n, with n companies and n products. Each company O is controlled by
one to five companies (two groups of companies, where each of these groups
controls the same company O, must have at least one member in common), where
the actual number of companies is chosen uniformly at random. On average this
results in 1.5 contr relations per company.

All experiments were performed on a Pentium III/733 machine with 256MB
of main memory running GNU/Linux. The binaries were generated with GCC
2.95.2. The input files used for the benchmarks are available on the web at
http://www.dbai.tuwien.ac.at/proj/dlv/lpnmr01.tar.gz.

8  Experimental  Results  and  Conclusion 

The results of our experiments are displayed in the graphs of Figures 4-7. For
each problem domain we report two graphs: in both graphs the horizontal axis
reports a parameter representing the size of the instance, while the vertical
axis reports the running time (expressed in seconds) and the number of
look-aheads, respectively, averaged over the 8 instances of the same size that
we ran (see previous section). The curves labeled "no opt.", "opt. 1", "opt. 2",
and "opt. 1+2" denote, respectively, the initial (unoptimized) version, the
look-ahead equivalence optimization, the 2-layered optimization, and the
combination of both look-ahead equivalence and 2-layered optimization.

Footnote: the HAMPATH instance generator is available at
http://tcs.hut.fi/Software/smodels/misc/hamilton.tar.gz

Fig. 4. 3SAT problems, average running times and look-aheads


Fig. 5. Blocksworld problems, average running times and look-aheads


Observe first that both optimizations always bring some gain over the original
version, as the "no optimization" curve lies above the other three curves
in all graphs.

The two optimizations have a different impact depending on the problem
domain: for Blocksworld, the equivalence optimization performs better than the
2-layered approach, while for Strategic Companies and 3SAT the opposite holds.
For Hamiltonian Path both optimizations behave roughly equally.
Fig. 6. Strategic Companies, average running times and look-aheads

Fig. 7. Hamiltonian Path problems, average running times and look-aheads


The combination of the two optimizations combines the benefits in the sense
that performance is always as good as for the best of the two strategies. Indeed,
the curve combining the two strategies (opt. 1+2) often nearly coincides with
the curve of the best of opt. 1 and opt. 2, e.g. for Blocksworld opt. 1+2 and opt. 1
are almost equal, while for Strategic Companies opt. 1+2 and opt. 2 coincide. On
Hamiltonian Path opt. 1, opt. 2, and opt. 1+2 all give the same speed-up. Finally,
in the case of 3SAT there are even better results for opt. 1+2 than for either of
the two methods alone.

Note that for opt. 2 (and opt. 1+2), the runtime and the number of
look-aheads need not correlate, as fewer look-aheads are performed but the quality
of the PTs may be worse, which may lead to larger trees. For opt. 1 the choices
remain the same and only the number of look-aheads can be reduced, so avoided
look-aheads directly reduce the runtime in this case.

Thus, both optimizations turned out to be useful, and we have incorporated
their combination in the version of DLV released in June 2001. We believe that
this is a promising way towards the improvement of ASP systems that should be
the subject of further investigation. Indeed, besides optimizing the implementation
of the techniques proposed in this paper, we have already planned future work
to explore other promising ways to reduce the number of look-aheads.
Abstract. Default Logic is recognized as a powerful framework for
knowledge representation and incomplete information management. Its
expressive power is suitable for nonmonotonic reasoning, but the
counterpart is a very high level of computational complexity. The purpose
of this paper is to show how heuristics such as Genetic Algorithms, Ant
Colony Optimization and Local Search can be used to build an
efficient nonmonotonic reasoning system.


1  Introduction 

People are used to managing and reasoning from incomplete information. Every
day they make decisions without knowing every aspect of their environment. In
many cases, this type of rough reasoning, based on natural and intuitive
approximations of knowledge, appears easier and more efficient than applying formal
logical or mathematical deduction systems. From these remarks, one could
expect that an Artificial Intelligence system would be easy to conceive and would
be very efficient. Unfortunately, this is not the case. Twenty years ago, Reiter [14]
laid the foundations of Default Logic, which is nowadays recognized as one of the best
frameworks to capture and formalize common sense reasoning from incomplete
information. Default Logic represents incompletely specified rules
by means of rules with exceptions and defines a deduction mechanism that
yields conclusions even if some data are not available. Unfortunately, this approach
has a very high theoretical complexity. As a matter of fact, computing a
set of plausible conclusions (called an extension) of a finite propositional default
theory is $\Sigma_2^P$-complete [5]. The difference in performance between human and
artificial approaches stems from the fact that human reasoning can avoid many
verifications, while default logic builds a set of coherent conclusions and discards
some kinds of inconsistencies.

Previous work [11,2] has already investigated this computational aspect of
default logic and, even if some systems perform well on certain classes
of default theories, there is no efficient system for general extension computation. Due
to this computational complexity, a deterministic method based on an exhaustive
exploration of the search space would not be efficient for nontrivial theories,
even if it used sophisticated pruning methods.


T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 309-321, 2001.
We argue that nondeterministic approaches [10,13] can be (on average) more
efficient, in spite of their incompleteness.


Fig. 1. GA, ACO + LS systems for Default Logic


In this paper, we present different, sometimes complementary, approaches
that we have implemented in operational systems able to perform default
reasoning on nontrivial knowledge bases. The purpose of our different algorithms
is to progressively improve a given initial configuration in order to reach a
solution. Three general approaches are considered here. Genetic Algorithms are
based on the principles of natural selection. Populations of possible solutions
evolve through a process of mutation and crossover in order to generate better
and better configurations. Ant Colony Optimization is inspired by the
observation of the collective behavior of ants when they are seeking food. Its goal is to
find an optimal path in a graph encoding the problem to solve. Finally, Local
Search relies on an incremental improvement of a potential solution to a given
problem by local moves from a configuration to its neighbors. Local Search will
be used here as an additional optimization mechanism and combined with the
previous methods. The general architecture of our system is summarized in Figure 1
and detailed in Sections 3, 4 and 5.
2 Extension Computation in Default Logic


In Default Logic [14] knowledge is represented by a default theory $(W, D)$ where
$W$ contains the safe knowledge (in this work it is a finite set of propositional
formulas) and $D$ is a finite set of default rules (or defaults). A default
$\delta = \frac{\alpha : \beta_1, \ldots, \beta_n}{\gamma}$
is an inference rule ($\alpha$, $\gamma$ and all $\beta_i$ are propositional formulas) whose
meaning is "if the prerequisite $\alpha$ is proved, and if for all $i = 1, \ldots, n$ each
justification $\beta_i$ is individually consistent (in other words if nothing proves its
negation) then one concludes the consequent $\gamma$" (see footnote 1).

Given a default theory, Reiter has defined a set of its plausible conclusions,
named an extension, by means of a fixpoint operator. In addition, he has given
the following result:

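Theorem 1. [14] Let $(W, D)$ be a default theory and $E$ a formula set. We
define $E_0 = W$ and, for all $k \geq 0$,

$$E_{k+1} = Th(E_k) \cup \left\{ \gamma \;\middle|\; \frac{\alpha : \beta_1, \ldots, \beta_n}{\gamma} \in D,\ E_k \vdash \alpha,\ \text{and } E \nvdash \neg\beta_i,\ \forall i = 1, \ldots, n \right\}$$

Then, $E$ is an extension of $(W, D)$ iff $E = \bigcup_{k=0}^{\infty} E_k$.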

Note  that  E,  the  whole  extension  to  build,  is  used  in  its  own  definition.  This  non 
constructive  characterization  is  also  an  argument  to  choose  a  “guess  and  check” 
method  as  we  have  done  in  this  work. 
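The "guess and check" reading of Theorem 1 can be illustrated on tiny propositional theories. In the sketch below, formulas are represented as Python predicates over a truth assignment and entailment is decided by truth tables, which is only feasible for a handful of atoms. The names `entails` and `is_extension`, as well as this representation, are assumptions for illustration and not the implementation used in this work.

```python
from itertools import product

def entails(premises, formula, atoms):
    """Truth-table test: does every model of `premises` satisfy `formula`?"""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(p(model) for p in premises) and not formula(model):
            return False
    return True

def is_extension(W, D, guess, atoms):
    """Check the guess E = Th(W u cons(guess)) against Theorem 1.

    W: list of formulas; D: list of defaults (pre, justifications, cons);
    guess: indices into D of the defaults assumed to be applied.
    """
    E = list(W) + [D[i][2] for i in guess]
    Ek = list(W)            # finite base generating the current E_k
    applied = set()
    changed = True
    while changed:
        changed = False
        for i, (pre, justs, cons) in enumerate(D):
            if i in applied or not entails(Ek, pre, atoms):
                continue
            # Justifications are tested against the guessed E, not against E_k.
            if any(entails(E, lambda m, b=b: not b(m), atoms) for b in justs):
                continue
            applied.add(i)
            Ek.append(cons)
            changed = True
    # E equals the union of all E_k iff the two bases entail each other.
    return (all(entails(Ek, f, atoms) for f in E)
            and all(entails(E, f, atoms) for f in Ek))
```

With $W = \{a\}$ and the single default $\frac{a : b}{c}$, guessing that the default is applied passes the check, while guessing the empty set does not.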

It is important to note that a default theory may have one or multiple
extensions and sometimes no extension at all. Now, we give some additional material
useful for the understanding of the rest of the paper.


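Definition 1. Let $E$ be an extension of a default theory $(W, D)$; its Generating
Default Set is

$$GD(W, D, E) = \left\{ \frac{\alpha : \beta_1, \ldots, \beta_n}{\gamma} \in D \;\middle|\; E \vdash \alpha \text{ and } E \nvdash \neg\beta_i,\ \forall i = 1, \ldots, n \right\}$$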


Furthermore, given a default theory $(W, D)$, computing its extension $E$ is
equivalent to finding its Generating Default Set $\Delta$ since $E = Th(W \cup cons(\Delta))$ [15].


Definition 2. [16] Given a default theory $(W, D)$, a set of defaults $\Delta \subseteq D$
is grounded if $\Delta$ can be ordered as a sequence $(\delta_1, \ldots, \delta_n)$ satisfying:
$\forall i = 1, \ldots, n$, $W \cup cons(\{\delta_1, \ldots, \delta_{i-1}\}) \vdash pre(\delta_i)$.

Lemma  1.  [16]  Every  generating  default  set  is  grounded. 
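Groundedness in the sense of Definition 2 can be tested by forward chaining. The sketch below deliberately simplifies the setting: prerequisites and consequents are single atoms, and entailment is reduced to set membership. These restrictions, and the name `maximal_grounded_subset`, are assumptions for illustration; a real implementation would call a propositional prover instead.

```python
def maximal_grounded_subset(W, C):
    """W: set of atoms that hold; C: list of (prerequisite, consequent) atoms.

    Returns the defaults of C in an order witnessing their groundedness.
    """
    derived = set(W)
    grounded = []
    remaining = list(C)
    progress = True
    while progress:
        progress = False
        for d in list(remaining):
            pre, cons = d
            if pre in derived:       # simplified test for W u cons(...) |- pre(d)
                grounded.append(d)
                derived.add(cons)
                remaining.remove(d)
                progress = True
    return grounded
```

Under these assumptions, a set `C` is grounded exactly when the returned list contains all of its defaults, and the length of the list is the size of its maximal grounded subset.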

Footnote 1: If $\delta$ is a default rule, $pre(\delta)$, $jus(\delta)$ and $cons(\delta)$ respectively denote the
prerequisite, the set of justifications and the consequent of $\delta$. These definitions are also
extended to sets of defaults.
The problem we address in this paper is the Extension Computation
Problem (ECP), which can be defined w.r.t. our heuristic approach by the following
components.

Definition 3. ECP

- A default theory $(W, D)$

- The set $\mathcal{CGD} = 2^D$ of possible configurations, called candidate generating
default sets.

- Given a candidate generating default set $C \in \mathcal{CGD}$, the candidate extension
associated to $C$ is

$CE(W, D, C) = Th(W \cup \{cons(\delta) \mid \delta \in C\})$ (see footnote 2)

Given an ECP, a solution is a candidate generating default set $C \in \mathcal{CGD}$ such
that $CE(W, D, C)$ is an extension w.r.t. Theorem 1.

The  last  step  of  our  heuristic  approach  consists  in  defining  an  evaluation 
function  in  order  to  compute  the  fitness  of  a  candidate  generating  default  set  C 
w.r.t.  the  notion  of  solution.  This  evaluation  relies  on  the  four  intermediate 
functions  described  below. 

$f_0$ rates whether the candidate extension is consistent or not.

$f_0(C) = 0$ if $CE(C)$ is consistent
$f_0(C) = 1$ otherwise

$f_1$ rates the correctness of the candidate generating default set with respect to
Definition 1.

$f_1(C) = \sum_{i=1}^{n} \pi(\delta_i)$ where $n = card(D)$,

with $\pi$ defined as follows (first column: $\delta_i \in C$; second column:
$CE(C) \vdash pre(\delta_i)$; third column: $\exists \beta \in jus(\delta_i),\ CE(C) \vdash \neg\beta$).

in C     prerequisite proved   justification refuted   $\pi$
true     true                  true                    k
true     true                  false                   0
true     false                 true                    k
true     false                 false                   k
false    true                  true                    0
false    true                  false                   k
false    false                 true                    0
false    false                 false                   0

$k$ is a positive number that represents a penalty given to each default that
has been wrongly applied or wrongly not applied.

$f_2$ rates the level of groundedness of the candidate generating default set.

$f_2(C) = card(\Gamma)$

where $\Gamma$ is the maximal grounded subset of $C$.

Footnote 2: We use $CE(C)$ instead of $CE(W, D, C)$ when it is clear from context.
$f_3$ definitively checks this property.

$f_3(C) = 0$ if $C$ is grounded
$f_3(C) = 1$ otherwise


The  complete  evaluation  function  is  defined  w.r.t.  previous  components  by 
taking  into  account  experimental  tuning  and  theoretical  properties. 

Definition 4. Given a default theory $(W, D)$ and a candidate generating default set
$C \in \mathcal{CGD}$, the evaluation of $C$ is defined by $eval : \mathcal{CGD} \to \mathbb{Z} \cup \{\top, \bot\}$

if $f_0(C) = 1$
then $eval(C) = \top$
else if $f_1(C) = 0$ and $f_3(C) = 0$
then $eval(C) = \bot$
else $eval(C) = f_1(C) - f_2(C)$


Theorem 2. A solution of an ECP is a set $C \in \mathcal{CGD}$ such that $eval(C) = \bot$.

We now describe the different methods that we propose to solve an ECP.
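The case analysis of Definition 4 can be written down directly once the four intermediate values are available. The sketch below assumes that $f_0$ returns 1 for an inconsistent candidate extension and that the two special values are encoded as the strings `TOP` and `BOTTOM`; the names `evaluate` and `sort_key` are hypothetical. The sort key places $\bot$ below and $\top$ above every integer, so that better candidates come first.

```python
TOP, BOTTOM = "TOP", "BOTTOM"

def evaluate(f0, f1, f2, f3):
    """Combine the intermediate ratings following Definition 4."""
    if f0 == 1:                  # inconsistent candidate extension
        return TOP
    if f1 == 0 and f3 == 0:      # correct and grounded: a solution
        return BOTTOM
    return f1 - f2

def sort_key(value):
    """Order evaluations: BOTTOM first, then integers, then TOP."""
    if value == BOTTOM:
        return (0, 0)
    if value == TOP:
        return (2, 0)
    return (1, value)
```

Sorting a population with `key=sort_key` then puts solutions first and inconsistent candidates last.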


3  Genetic  Algorithms 

Genetic Algorithms [8,6] are based on the principle of natural selection. We first
consider a population of individuals represented by their chromosomes. Each
chromosome represents a potential solution to an ECP. Applying a genetic
algorithm consists in generating better and better individuals by evaluating,
selecting, and mating (crossing and mutating) them.

A representation scheme consists of the two following elements: a chromosome
language $\mathcal{G}$ defined by a chosen size, and an interpretation mapping that translates
chromosomes in terms of generating default sets, which provides the semantics
of the chromosomes. In our context, for each default
$\frac{\alpha : \beta_1, \ldots, \beta_n}{\gamma}$ we encode in
the chromosome the prerequisite $\alpha$ with one bit, and all justifications $\beta_1, \ldots, \beta_n$
conjointly with one other bit. Therefore, given a set of defaults $D = \{\delta_1, \ldots, \delta_n\}$
the size of the chromosome will be $2n$ and the chromosome language $\mathcal{G}$ is the
regular language $(0 + 1)^{2n}$ (i.e. strings of $2n$ bits). Given a chromosome $G \in \mathcal{G}$,
$G|_i$ denotes the value of $G$ at occurrence $i$, $1 \leq i \leq 2n$. The interpretation
mapping, defining the semantics of the chromosomes (also called their phenotype),
can be formally described as:

Definition 5. Given a set of defaults $D$ and a chromosome language $\mathcal{G}$, an
interpretation mapping is defined as

$\phi : \mathcal{G} \times D \to \{true, false\}$ such that:

$\phi(G, \delta_i) = true$ if $G|_{2i-1} = 1$ and $G|_{2i} = 0$
$\phi(G, \delta_i) = false$ in other cases

Therefore, the chromosomes encode the candidate generating default sets as:

Definition 6. Given a default set $D$ and a chromosome $G \in \mathcal{G}$, the candidate
generating default set associated to $G$ is:

$CGD(D, G) = \{\delta_i \in D \mid \phi(G, \delta_i) = true\}$

According to Definition 3, every chromosome $G$ induces a candidate
extension $CE(W, D, CGD(D, G))$, denoted $CE(G)$ when it is clear from the context.
Intuitively, for a default $\delta_i$, if $G|_{2i-1} = 1$ then its prerequisite is considered to be
in $CE(G)$ and if $G|_{2i} = 0$ no negation of its justifications is assumed to belong
to $CE(G)$.
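The interpretation mapping of Definitions 5 and 6 amounts to reading the chromosome two bits at a time. A minimal sketch, assuming chromosomes are Python strings of '0' and '1' and defaults are identified by their 1-based index (the name `decode` is hypothetical):

```python
def decode(chromosome):
    """Return the 1-based indices i of the defaults selected by a bit string."""
    assert len(chromosome) % 2 == 0, "two bits are needed per default"
    selected = []
    for i in range(1, len(chromosome) // 2 + 1):
        pre_bit = chromosome[2 * i - 2]   # G|_{2i-1} (Python strings are 0-indexed)
        jus_bit = chromosome[2 * i - 1]   # G|_{2i}
        if pre_bit == "1" and jus_bit == "0":
            selected.append(i)
    return selected
```

For instance, `decode("100011")` selects only the first default, while `decode("101011")` selects the first and the second.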

Example 1. Let $(W, D) = (\{a\}, \{\ldots\})$ be a default theory. We get:

$CGD(100011) = \{\ldots\}$ and $CE(100011) = Th(\{a, c\})$, which is indeed an
extension, but also $CGD(101011) = \{\ldots\}$ and $CE(101011) = Th(\{a, c, \neg b\})$,
which is not an extension.

The GA we use deals with some intermediate populations, as
summarized in Figure 2.

$P \to P_{\prec} \to P_{sel} \to P_{parents} \to P_{children}$ (which becomes the next $P$)

Fig. 2. Main steps of the GA


Generation of the initial population $P$ is crucial to the efficiency of the GA. The
simplest way is a random generation, but this does not take into account
the default theory of interest. A more efficient way consists in building
chromosomes whose phenotypes are grounded (consistent) subsets of $D$. We introduce
a probability $p_i$ of insertion of a default in the candidate generating default set
to randomly create a candidate, and we randomly associate to each default $\delta$
of $D$ a number $p_\delta \in [0, 1]$. The inductive definition below gives by fixpoint the
candidate generating default set $\Delta_\infty$.

- $\Delta_0 = \emptyset$, $D_0 = D$,

- $\forall j > 0$, $\delta \in D_{j-1}$, $W \cup cons(\Delta_{j-1}) \vdash pre(\delta)$,

  $\Delta_j = \Delta_{j-1} \cup \{\delta\}$ if $p_\delta < p_i$ and $W \cup cons(\Delta_{j-1} \cup \{\delta\}) \nvdash \bot$

  $\Delta_j = \Delta_{j-1}$ otherwise

Then a chromosome $G_\infty$ can be chosen randomly from $\{G \mid CGD(D, G) = \Delta_\infty\}$.
We also guarantee that all the chromosomes of the initial population are different.
If we add the condition

$\forall \beta \in jus(\delta)$, $W \cup cons(\Delta_{j-1} \cup \{\delta\}) \nvdash \neg\beta$

to the inductive part of the construction we are able to generate an initial
population of incrementally non-conflicting grounded phenotypes [9]. However, we
never completely check that all defaults are not conflicting because our goal is
not to directly build a generating default set. As mentioned in the introduction,
we think that this task is too difficult for a classical algorithm and we just search for
good starting points for our algorithm. Note that, for a technical reason explained
below, $S_p$, the size of the population, is such that $\exists N_p$, $S_p = \ldots$

Then, we build $P_{\prec}$, where chromosomes of $P$ are ordered w.r.t. their
evaluation and where two identical chromosomes are represented only once. The
ordering $\prec$ of the individuals is the natural extension of the usual ordering of $\mathbb{Z}$
extended with: $\forall x \in \mathbb{Z}$, $x \prec \top$ and $\forall x \in \mathbb{Z}$, $\bot \prec x$.

After that, the purpose of the selection stage is to generate a selected
population $P_{sel}$ containing chromosomes with the best rates according to the evaluation
function. Furthermore, we try to keep a large diversity of selected chromosomes
by introducing a Hamming distance $H_d$ (the Hamming distance is the number of
differing bits between two chromosomes). $P_{sel}$ is defined as the $N_p$ first
chromosomes of $P_{\prec}$ respecting the Hamming distance.

We choose ranking selection to generate the parent population $P_{parents}$.
The best chromosome in $P_{sel}$ is duplicated $k$ times, the second one $k - 1$ times,
..., and the $k$-th one 1 time in $P_{parents}$.
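The selection and ranking steps can be sketched as follows. The source does not spell out how the Hamming distance is enforced, so the greedy reading used here (a chromosome is kept only if it is at distance at least `h_d` from every chromosome already selected) is an assumption, as are the function names.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def select(ordered, n_p, h_d):
    """Keep the first n_p chromosomes of an evaluation-ordered list that are
    pairwise at Hamming distance >= h_d (greedy interpretation)."""
    selected = []
    for g in ordered:
        if all(hamming(g, s) >= h_d for s in selected):
            selected.append(g)
        if len(selected) == n_p:
            break
    return selected

def ranking_parents(selected):
    """Duplicate the best chromosome k times, the second k-1 times, and so on."""
    k = len(selected)
    parents = []
    for rank, g in enumerate(selected):
        parents.extend([g] * (k - rank))
    return parents
```

With three selected chromosomes, `ranking_parents` returns six parents, so better-ranked chromosomes are proportionally more likely to be drawn for crossover.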

Genetic operators are now applied on $P_{parents}$ in order to generate the
offspring $P_{children}$. They are controlled by a crossover probability $p_c$ and a mutation
probability $p_m$. The crossover is performed as follows:

- select randomly two chromosomes $A = (a_1, \ldots, a_{2n})$ and $B = (b_1, \ldots, b_{2n})$
in $P_{parents}$

- generate randomly a number $r \in [0, 1]$

- if $r < p_c$ then the crossover is possible:

  • select a random position $p \in \{1, \ldots, 2n - 1\}$

  • the chromosomes $(a_1, \ldots, a_p, a_{p+1}, \ldots, a_{2n})$ and $(b_1, \ldots, b_p, b_{p+1}, \ldots, b_{2n})$
  generate the two new chromosomes $(a_1, \ldots, a_p, b_{p+1}, \ldots, b_{2n})$ and
  $(b_1, \ldots, b_p, a_{p+1}, \ldots, a_{2n})$ that are put in $P_{children}$.

- else $A$ and $B$ are put in $P_{children}$ without crossover

Mutation is defined as follows:

- for each chromosome $G \in P_{children}$ and for each bit $b_j$ in $G$, generate a
random number $r \in [0, 1]$,

- if $r < p_m$ then mutate the bit $b_j$ (i.e. flip the bit).
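The two genetic operators described above can be sketched in a few lines. Chromosomes are assumed to be Python strings of '0' and '1', and the random source is passed in explicitly so that runs can be reproduced; the function names are hypothetical.

```python
import random

def crossover(a, b, p_c, rng):
    """One-point crossover of two equal-length bit strings."""
    if rng.random() < p_c:
        p = rng.randint(1, len(a) - 1)   # cut position in {1, ..., 2n-1}
        return a[:p] + b[p:], b[:p] + a[p:]
    return a, b                          # parents are copied unchanged

def mutate(g, p_m, rng):
    """Flip each bit of g independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < p_m else bit for bit in g)
```

A typical call would be `crossover(a, b, 0.8, random.Random(42))` followed by `mutate` on each child; the probability values here are placeholders, not the settings used in the experiments.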

The population obtained after these operations becomes the current
population and will be the new input of the whole process described previously. This
full process is repeated to generate successive populations in which the best
chromosome w.r.t. the evaluation function represents the current best solution
to the ECP. If a chromosome $G$ such that $eval(CGD(D, G)) = \bot$ appears in
a population then the method stops and $CE(G)$ is an extension for the given
default theory. Otherwise, it stops when a maximal number of populations to be
explored is reached.
4 Ant Colony Optimization

Ant Colony Optimization (ACO) metaheuristics [4,3] have been inspired by the
observation of the collective behaviour of ants when they are seeking food. Let
us suppose that there are many ants in a nest and that we deposit food in a
place linked to the nest by two different paths $P_1$ and $P_2$, such that $P_1$ is shorter
than $P_2$. At the beginning of their exploration approximately the same number
of ants will choose one path or the other. But, after a few minutes, most of the
ants will use the shortest path $P_1$. The emergence of this shortest preferred path
is explained by the following points:

—  every  ant  puts  a  little  bit  of  pheromone  all  along  its  walk 

—  every  ant  directs  itself  by  doing  a  probabilistic  choice  biased  by  the  amount 
of  pheromone  that  it  finds  on  each  possible  path 

—  the  pheromone  evaporates 

Thus, the amount of pheromone on $P_1$ increases faster than on $P_2$ since, in the
same time span, a greater number of ants complete this path. As a consequence, a
greater number of ants choose $P_1$ since its attractivity becomes higher. By
reinforcement, the amount of pheromone on $P_2$ decreases and that on $P_1$ increases,
directing almost all ants to the shortest path.

This  collective  behavior  based  on  a  kind  of  shared  memory  (the  pheromone) 
can  be  used  for  the  resolution  of  combinatorial  problems  that  can  be  encoded 
as  the  search  of  an  optimal  path  in  a  graph.  For  the  ECP  in  Default  Logic  we 
propose  the  following  encoding. 

Definition 7. The default graph of a default theory $(W, D)$ is

$G(W, D) = (D \cup \{in, out\}, A)$

where each default becomes a vertex and in and out are two particular vertices
added to the default set. $A$ is the arc set defined by

$A = \{(in, \delta),\ \forall \delta \in D \mid W \vdash pre(\delta) \text{ and } \forall \beta \in jus(\delta),\ W \cup cons(\delta) \nvdash \neg\beta\}$
$\cup\ \{(\delta, \delta'),\ \forall (\delta, \delta') \in D^2,\ \delta \neq \delta'\} \cup \{(\delta, out),\ \forall \delta \in D\} \cup \{(in, out)\}$

In addition, each arc $(i, j) \in A$ is weighted by an artificial pheromone $\varphi_{i,j}$ that
is a positive real number.

Definition 8. Given a default theory $(W, D)$ and a path $P$ from in to out in
$G(W, D)$, the candidate generating default set associated to $P$ is: $CGD(D, P) = P \cap D$.

In the sequel, we identify vertices and defaults and we indifferently use $P$ as a
path in the graph or as a candidate generating default set by discarding in and
out if necessary. Thus, the goal of ACO is to find a path from in to out that
corresponds to a true generating default set.
We do not systematically put an arc from in to every default in $D$, since we
want to start the search with defaults that can be applied in $W$. In addition, after
the building phase, we remove from $A$ the arcs

- $(\delta, \_)$ and $(\_, \delta)$ if $\exists \beta \in jus(\delta)$, $W \cup cons(\delta) \vdash \neg\beta$,
because such a default $\delta$ (one whose consequent contradicts one of its own
justifications) can never be applied, so it is useless to build a path including $\delta$;

- $(\delta, \delta')$ if $W \cup cons(\delta) \cup cons(\delta') \vdash \bot$,
since $\delta$ and $\delta'$ are incompatible. This does not prevent the two defaults from
appearing in the same path, but it reduces the search space.

Obviously, much more could be done to prune the graph by a deep analysis of the
default theory, but this could become very expensive.

At  the  beginning,  the  pheromone  on  every  arc  of  the  graph  is  initialized  to  1 
in  order  to  give  equal  chance  to  all  paths.  During  the  process  this  pheromone 
globally  evaporates  and  increases  on  arcs  that  are  on  good  paths  in  order  to 
concentrate  a  great  number  of  ants  on  the  better  parts  of  the  graph. 

In  order  to  guide  each  ant  during  its  journey  from  in  to  out  we  also  use  a 
local  evaluation  based  on  the  function  loc. 

Definition 9. Let $P$ be a path in the graph and $\delta$ a default. We say that:

- $\delta$ is grounded in $P$ if $W \cup cons(P) \vdash pre(\delta)$

- $\delta$ is compatible with $P$ if $\forall \beta \in jus(\delta)$, $W \cup cons(P) \cup \{\beta\}$ is consistent

and we define

$loc(P, \delta) = 1$ if $\delta$ is grounded in $P$ and compatible with $P$
$loc(P, \delta) = 0$ otherwise

This  local  function  combined  with  the  recorded  pheromone  leads  to  the  definition 
of  the  attractivity  of  a  vertex  S  for  an  ant  staying  on  the  last  vertex  of  a  partial 
path  P  between  in  and  out. 

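Definition 10. Let $G(W, D) = (V, A)$ be a default graph and $P$ a path from vertex
in to vertex $v_i$. We define $\mathcal{R}(v_i, P) = \{v_j \in V \setminus P \text{ s.t. } (v_i, v_j) \in A\}$, the set of
vertices reachable from $v_i$, and the attractivity of each vertex $v_j \in \mathcal{R}(v_i, P)$ as

$$\mathcal{A}(v_i, v_j, P) = \frac{\varphi_{i,j} * loc(P, v_j)}{\sum_{v_k \in \mathcal{R}(v_i, P)} \varphi_{i,k} * loc(P, v_k)}$$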


On each vertex $v_i$ during its travel from in to out, an ant chooses the next
vertex by a random choice among all reachable vertices $v_j$, and this choice is
biased by the values $\mathcal{A}(v_i, v_j, P)$. By definition of the function loc, the only paths
that can be explored correspond to sets of defaults that are grounded (in the sense
of Def. 2) and maximal. In this way we discard candidate generating default
sets that obviously have no chance of being a solution of the ECP, and the search
process is focused on "better" candidates.
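The biased random choice of Definition 10 can be sketched as a normalization followed by a weighted draw. The dictionary-based representation of the pheromone and of the local evaluation, and the function names, are assumptions for illustration; how an ant proceeds when no vertex is admissible is not covered here.

```python
import random

def attractivities(pheromone, current, reachable, loc):
    """pheromone: dict (i, j) -> positive float; loc: dict vertex -> 0 or 1.

    Returns a dict vertex -> probability, or {} if no vertex is admissible.
    """
    weights = {v: pheromone[(current, v)] * loc[v] for v in reachable}
    total = sum(weights.values())
    if total == 0:
        return {}
    return {v: w / total for v, w in weights.items()}

def choose_next(probabilities, rng):
    """Draw the next vertex according to the attractivity values."""
    vertices = list(probabilities)
    return rng.choices(vertices, weights=[probabilities[v] for v in vertices])[0]
```

Vertices with a local evaluation of 0 receive probability 0, so they are never drawn even when they carry a large amount of pheromone.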

So, the main iteration of the whole algorithm is:

- release $N$ ants at vertex in

- evaluate their paths $P_i$, $i = 1, \ldots, N$, from in to out by $eval(CGD(D, P_i))$

- increase the pheromone on better paths: $\varphi_{i,j} \leftarrow \varphi_{i,j} + 0.9^{\ldots}$, $\forall$ arc $(i, j)$
in the best $k$ paths

- decrease (by 1%) the pheromone on every arc (evaporation)

- if the evaluation of the best path is $\bot$ then the algorithm stops and the
best path $P$ gives an extension $CE(CGD(D, P))$; otherwise a new iteration
starts, until the maximum number of iterations is reached
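The pheromone bookkeeping of one iteration (reinforcement of the best paths followed by global evaporation) can be sketched as below. The amount deposited on each arc is left as a free parameter `deposit`, which is an assumption: the exact reinforcement term of the source is not reproduced here. Paths are assumed to be lists of vertices and the pheromone a dictionary indexed by arcs.

```python
def update_pheromone(pheromone, best_paths, deposit, evaporation=0.01):
    """Reinforce the arcs of the best paths, then evaporate on every arc."""
    for path in best_paths:
        # Consecutive vertices of a path form its arcs.
        for arc in zip(path, path[1:]):
            pheromone[arc] += deposit
    for arc in pheromone:
        pheromone[arc] *= 1.0 - evaporation   # 1% decrease by default
    return pheromone
```

Because evaporation is applied to every arc while reinforcement only touches arcs on good paths, arcs that are never reinforced slowly lose their influence on the ants' choices.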


5  Local  Search 

Local Search (LS) is a class of powerful methods to tackle difficult optimization
problems. The development of modern metaheuristics such as Tabu Search or
Simulated Annealing [1] has greatly increased their use and their efficiency.

In this work, LS will not be used as a search heuristic alone but combined with
GA and ACO to get better results. For an ECP, a chosen number of the best
individuals (in the population or in the set of paths) are improved by a number of
LS iterations. After these improvements, they are put back in the population (or
set of paths) for the next GA or ACO iteration. This acts as an improve/repair
stage. Therefore results depend on the number of individuals submitted to LS
and on the number of iterations performed.

The LS framework can be described as follows: given a finite set of
configurations $S$ and a cost function $f$, the purpose of the method is to determine an
optimal $s^*$ such that $\forall s \in S$, $f(s^*) \leq f(s)$.

Local search mainly relies on the notion of neighborhood, which allows the
search algorithm to move from one configuration to another in order to reach
an optimum. Therefore, given a neighborhood function $\mathcal{N} : S \to 2^S$ and an initial
configuration $s_0$, an LS algorithm produces a series of configurations $(s_i)_{i \in [0..n]}$
such that $\forall i$, $s_{i+1} \in \mathcal{N}(s_i)$.

Here, the search space is the previously defined set $\mathcal{CGD}$ of candidate generating
default sets. We keep the previous evaluation function eval (Def. 4) as cost function.
We just focus here on the definition of the neighborhood.

Concerning the moves in this search space, according to the definition of candidate extensions associated to individuals, they will be defined w.r.t. the notion of applied default. We impose that two neighbor candidate generating default sets differ only by one of their defaults. The neighborhood can be defined as a function $\mathcal{N}: \mathcal{CGD} \to 2^{\mathcal{CGD}}$ such that $\mathcal{N}(C) = \{C' \in \mathcal{CGD} \mid C' = C \cup \{\delta\}, \delta \notin C \text{ or } C' = C - \{\delta\}, \delta \in C\}$.

In order to experiment with local search techniques and their combination with the previously described methods, two methods are used to explore the benefits of two different and representative ways of managing the moves: Descent with Random Walk (DRW) and Tabu Search (TS).

The DRW consists in choosing at each iteration the best neighbor, which replaces the current configuration only if it is better. Using this approach, a local optimum is always reached. A random walk principle is added to escape from local optima by moving, with a certain probability, from the current configuration to another one having a worse evaluation.

TS consists in moving from a configuration to its best allowed neighbor, which is not necessarily better than the current configuration. The allowed neighbors are configurations that are not forbidden by the tabu list, and each current configuration is recorded in this list. Therefore, possible cycles are avoided thanks to this tabu list. A so-called aspiration condition ensures that, if the neighborhood contains a better configuration than the current one, then it will be accepted as the new current configuration even if it is in the tabu list.

Of course, there exist many variants and extensions of these basic principles.
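The two move strategies described above can be sketched as follows. This is only an illustrative sketch, not the authors' Sicstus Prolog implementation: configurations are modeled as frozensets of default identifiers, neighbors differ by exactly one element as in the neighborhood definition, and the cost function `toy_cost` (distance to a hidden target set) is a hypothetical stand-in for eval. The names `neighbors`, `descent_random_walk`, `tabu_search`, `p_walk` and `tenure` are assumptions made for this example.

```python
import random

def neighbors(config, universe):
    """Candidate sets that differ from `config` by exactly one default."""
    return [config ^ {d} for d in sorted(universe)]

def descent_random_walk(start, universe, cost, iterations, p_walk=0.05, seed=0):
    """DRW: take the best neighbor only if it improves the current
    configuration; with probability p_walk, jump to a random neighbor."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(iterations):
        candidates = neighbors(current, universe)
        if rng.random() < p_walk:
            current = rng.choice(candidates)      # random walk, may get worse
        else:
            candidate = min(candidates, key=cost)
            if cost(candidate) < cost(current):   # descent step
                current = candidate
        if cost(current) < cost(best):
            best = current
    return best

def tabu_search(start, universe, cost, iterations, tenure=5):
    """TS: always move to the best allowed neighbor, even if it is worse."""
    current = best = start
    tabu = [start]
    for _ in range(iterations):
        # A tabu neighbor is still allowed if it improves on the current
        # configuration (aspiration condition).
        allowed = [c for c in neighbors(current, universe)
                   if c not in tabu or cost(c) < cost(current)]
        if not allowed:
            break
        current = min(allowed, key=cost)
        tabu = (tabu + [current])[-tenure:]       # keep the last `tenure` entries
        if cost(current) < cost(best):
            best = current
    return best

# Hypothetical cost: number of defaults on which we differ from a target set.
TARGET = frozenset({1, 3, 4})

def toy_cost(config):
    return len(config ^ TARGET)
```

With `universe = range(6)` and an empty starting set, both procedures reach the target set within a few iterations on this toy cost; in the hybrid scheme of the text they would be called for a small number of iterations on the few best individuals after each GA or ACO iteration.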


6  Experimental  Results  and  Conclusion 

We report here some experimental results on the GA, ACO and GA+LS systems that we have implemented in Sicstus Prolog 3.8.3 (we have also implemented ACO+LS, but due to a lack of space we only point out some of our results here).

Diversification: Table 1 refers to the influence of the Hamming distance for the problem ham_6_2 that encodes, with 45 defaults, a Hamiltonian cycle problem as in [2].

Tests have been done with 30 runs per Hamming distance hd with parameters Sp = 465, Pc = 0.8, Pm = 0.1, Pi = 0.9, an initial incrementally non-conflicting grounded population and a maximum number of iterations equal to 200. %s is the number of successful runs, ani the average number of iterations, at the average time in seconds for a run, ati the average time in seconds for one iteration, anis the average number of iterations for the successful runs and ats the average time in seconds for one successful run. It shows the importance of population diversity to increase the stability of the method (in number of iterations) and to speed up each iteration by decreasing the size of the population. It also demonstrates that too high a selective pressure (hd > 13) strongly reduces the chances of a successful run by decreasing too much the size of the selected population (and hence of the offspring).

LS tuning: In order to get good performance improvements from the combination of LS and GA, we have to adjust the parameters of the two LS algorithms. Concerning DRW, the tuning consists in determining the best value for the random walk parameter. Experiments give us a value around 0.05 (the aim is to avoid moves that are too stochastic). Concerning TS, we have to adjust the length of the tabu list (the tabu tenure). In fact, the more important parameters are the depth of the LS used and the number of candidates given to LS after each GA (or ACO) iteration. For both LS algorithms, it appears that 5 LS iterations on the 5 best candidates are a good compromise to get interesting results. Results obtained with GA+DRW and GA+TS are given in Table 2. Due to the small number of iterations, it appears that the tabu tenure does not really affect the results. Due to this specific use, DRW and TS can be considered as a way to quickly reach a local optimum from a good configuration. Their respective performances depend on the two different heuristics they use to explore the neighborhood. Moreover, parameters can be finely tuned according to each problem.

Results: Table 2 provides information on the performance of our methods (with the notations of Table 1). We report our best ACO experiments, in which we use N = 100 ants and the k = 7 best paths for reinforcement. We can observe the great impact of LS on the number of iterations of GA, even though only 5 individuals of the whole population are improved at each GA iteration. This also allows us to compare the performance in time of GA w.r.t. ACO and to compare our systems with DeRes [2].

Forthcoming works: Our methodology can be easily adapted to other variants of default logic provided that we adapt the function eval to the definition of extension in the targeted default logic. Moreover, our systems are able to do query answering in full Reiter’s default logic, and this will be described in a forthcoming paper.

We have to mention that, on logic programs with stable model semantics (a subcase of default logic), the system Smodels [12] has the best performance. We think this is because the benefit of our approaches has no effect on this kind of problem, whose complexity (NP-complete) is lower than $\Sigma_2^p$-complete. However, the previous people example can only be encoded in full Reiter’s default logic, which is beyond the scope of Smodels. Another interesting feature of our approach is its ability to do a kind of anytime reasoning: when the method stops without giving an extension, we still get an approximate solution that can be useful.

An interesting direction to explore is how we could derive benefits from the blocking set and supporting set structures introduced in [7]. They can be useful to define a more suitable neighborhood in the LS, to introduce a repair mechanism in GA, or to forbid some partial paths in ACO. Another question to deal with is the problem of non-existence of extensions. Currently, if a default theory has no extension, our systems stop after having done their maximal number of iterations and we cannot tell whether there is an extension or not. But the only way to assert that a general default theory (W, D) has no extension is to explore the whole set $\mathcal{CGD} = 2^D$, and this is not practicable for non-trivial cases. Nevertheless, [7] gives some sufficient conditions of non-existence that can be helpful in our work.


Abstract  The problem of computing X-minimal models, that is, models
minimal  with  respect  to  a  subset  X  of  all  the  atoms  in  a  theory,  is  very 
relevant  for  computing  circumscriptions  and  diagnosis.  Unfortunately, 
the  problem  is  NP-hard.  In  this  paper  we  present  two  novel  algorithms 
for  computing  X-minimal  models.  The  advantage  of  these  new 
algorithms  is  that,  unlike  existing  ones,  they  are  capable  of  generating 
the  models  one  by  one.  There  is  no  need  to  compute  a  superset  of  all 
minimal  models  before  finding  the  first  X-minimal  one.  Our  procedures 
may use local search techniques, or, alternatively, complete methods.
We  have  implemented  and  tested  the  algorithms  and  the  preliminary 
experimental  results  are  encouraging. 


1  Introduction 

Minimal  model  computation  is  a  crucial  task  in  many  reasoning  systems  in  Artificial 
Intelligence,  including  Logic  Programming,  Nonmonotonic  Reasoning,  and  Diagnosis 
[Re87,Mc80,dKW87].  Indeed,  a  considerable  effort  has  been  made  to  analyze  the 
complexity of this task and to build efficient algorithms and systems that solve it [e.g.
BD96,KL99,JNS00].

In this paper, we consider a more general computational problem: the problem of
computing  X-minimal  models.  When  we  look  for  X-minimal  models,  we  search  for 
models  that  are  minimal  with  respect  to  a  subset  X  of  all  the  atoms  in  the  theory.  The 
task  of  computing  minimal  models  is  a  special  case  of  generating  X-minimal  models, 
taking  X  to  be  all  the  atoms  in  the  theory.  X-minimal  models  are  particularly  relevant 
in  Diagnosis  and  Circumscription  [Re87,Mc80,Li85].  In  the  logical  approach  to 
Diagnosis,  the  artifact  to  be  diagnosed  and  the  behavior  of  the  system  are  encoded  as 
a  set  of  logical  sentences  called  the  system  description  and  the  observations, 
respectively.  The  components  of  the  system  are  represented  by  constants,  and  their 
status  -  whether  or  not  they  are  functioning  well  -  is  indicated  by  a  special  predicate 
called  an  abnormality  predicate  and  denoted  ab(.).  Normally,  we  assume  that  all  the 
abnormality predicates are false, that is, that the components in the system behave well. Once there is a fault, the theory composed of the system description and the observations becomes logically inconsistent if we assume that none of the components is abnormal. To restore consistency, we must assume that some components are malfunctioning. We prefer to explain the inconsistency with a minimal subset of abnormalities. It does not make sense to assume that a set of components is broken when the assumption that only a subset of this set is malfunctioning can explain the faulty behavior of the system. This is called “The Principle of Parsimony”. By this principle, we assume that only minimal subsets of components are faulty, or in other words, we look for models that are minimal with respect to the abnormality predicates.


Fig.  1.  An  example  circuit 


The system descriptions and observations are usually represented in first-order
logic.  For  the  sake  of  simplicity,  we  will  use  propositional  logic  in  this  paper.  The 
algorithms  presented  here  can  be  used  for  function-free  first-order  minimal  model 
computation  by  first  grounding  the  theories  involved.  Alternatively,  the  algorithms 
shown  here  can  serve  as  a  basis  for  developing  algorithms  tailored  for  first-order 
logic. 

As an example of model-based diagnosis using minimal models, consider the simple circuit shown in Figure 1. Assuming AB1 and AB2 mean that gates “not-1” and “not-2”, respectively, are malfunctioning, the system description (SD) for this circuit is:

$\neg AB1 \rightarrow [In1 \leftrightarrow \neg In2]$

$\neg AB2 \rightarrow [In2 \leftrightarrow \neg Out]$

Now, assume that In1 is 0 and Out is 1, indicating that the circuit is faulty. The observations (OBS) in this case are $\{\neg In1, Out\}$. If we assume that both gates are normal and take the theory that is the union of SD, OBS, and the literals $\{\neg AB1, \neg AB2\}$, we get an inconsistent theory. However, if we consider the theory consisting of the union of SD and OBS alone, and we look at the X-minimal models of this theory taking X to be $\{AB1, AB2\}$, we obtain two such models, in each of which only one gate is abnormal. Hence the diagnosis for the above system and observations is that either the first or the second gate (but not both) is faulty.
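The diagnosis example can be checked by brute force. The sketch below is an illustration only, not one of the algorithms of this paper: it enumerates all truth assignments, keeps the models of SD together with OBS, projects them on X, and retains the subset-minimal projections. The function names `satisfies` and `x_minimal_projections` are hypothetical, and the encoding of the first gate assumes the form $\neg AB1 \rightarrow [In1 \leftrightarrow \neg In2]$.

```python
from itertools import product

ATOMS = ["AB1", "AB2", "In1", "In2", "Out"]

def satisfies(m):
    """True iff assignment m (a dict atom -> bool) is a model of SD and OBS."""
    sd1 = m["AB1"] or (m["In1"] != m["In2"])   # not AB1 -> (In1 <-> not In2)
    sd2 = m["AB2"] or (m["In2"] != m["Out"])   # not AB2 -> (In2 <-> not Out)
    obs = (not m["In1"]) and m["Out"]          # observations: In1 = 0, Out = 1
    return sd1 and sd2 and obs

def x_minimal_projections(x):
    """Positive X-parts of the X-minimal models, found by exhaustive search."""
    assignments = (dict(zip(ATOMS, values))
                   for values in product([False, True], repeat=len(ATOMS)))
    projections = {frozenset(a for a in x if m[a])
                   for m in assignments if satisfies(m)}
    # Keep a projection only if no other projection is a proper subset of it.
    return {p for p in projections if not any(q < p for q in projections)}

diagnoses = x_minimal_projections(["AB1", "AB2"])
```

Here `diagnoses` contains exactly the two single-fault explanations, one blaming gate not-1 and one blaming gate not-2. The exhaustive enumeration is exponential in the number of atoms and is only meant to make the definition concrete.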

The  circuit  example  also  illustrates  why  a  demand-driven  computation  of  the  X- 
minimal  models  is  advantageous.  Each  X-minimal  model  explains  the  faulty  behavior 
of  the  system  by  suggesting  a  minimal  set  of  components  that  may  be  abnormal.  If  we 
are  given  the  models  one  by  one,  we  can  test  the  suspect  components  while  the  next 
model  is  being  computed. 

The  paper  is  organized  as  follows.  After  presenting  some  basic  definitions  and  known 
results  in  Section  2,  we  present  two  demand-driven  algorithms  for  X-minimal  model 
computation  in  Section  3.  Both  algorithms  may  be  implemented  either  using  local 
search  methods  or  complete  methods.  X-minimal  models  are  also  very  relevant  in 
computing  Circumscription.  We  elaborate  on  that  in  Section  4.  In  Section  5  we  report 
on  experiments  done  on  the  algorithms  developed  and  in  Sections  6  and  7  we  present 
related  work  and  concluding  remarks. 


2  Preliminary Definitions

In  this  section  we  provide  some  basic  terminology  used  throughout  the  paper. 

•  Literal - a propositional symbol (atom) (positive literal) or its negation (negative literal).

•  Clause  -  disjunction  of  literals. 

•  CNF  theory  -  conjunctive  normal  form,  a  conjunction  of  clauses.  All  the  theories 
in  this  paper  are  assumed  to  be  in  CNF.  Hence  by  theory  we  mean  a  set  of 
clauses.  A  theory  is  Horn  if  and  only  if  each  clause  in  the  theory  contains  at  most 
one  positive  literal. 

•  Positive graph of a theory - Let T be a theory. The positive graph of T is an undirected graph (V, E) defined as follows: V = {P | P is a positive literal in some clause in T}, E = {(P, Q) | P and Q appear positive in the same clause}.

•  Vertex cover - Let G = (V, E) be a graph. A vertex cover of G is a set $V' \subseteq V$ such that for each $e \in E$ there is some $v \in V'$ such that $v \in e$.

•  Vertex cover of a theory - a vertex cover of the positive graph of the theory. Note that if all the atoms of a vertex cover of a theory are instantiated, the theory becomes Horn.

•  Model  -  a  truth  assignment  to  all  the  atoms  in  the  theory  that  makes  the  theory 
true. 

•  Pos  (M)  -  the  set  of  the  atoms  assigned  true  in  a  model  M. 

•  Lit(v) - a representation of a truth assignment v as a set of literals. For example, if v = {P = true, Q = false, R = false}, then Lit(v) = $\{P, \neg Q, \neg R\}$.

•  Unit  clause  -  clause  that  contains  only  one  literal. 

•  Unit propagation - the process where, given a theory T, you do the following until there is no change in the theory (no new clauses are generated and no clause is deleted): you pick a unit clause C from T, delete the negation of C from each clause and delete each clause that contains C.

•  $T \oplus S$ - the result of unit propagation on $T \cup S$.

•  {true (false)} - the set of atoms that are assigned true (false).

•  $Int_X(M)$ - the value (integer) of a model M over a given ordered set of variables $X = \{P_r, \ldots, P_1\}$, seen as a binary number where $P_r$ is the most significant bit (MSB or MSV - most significant variable) and $P_1$ is the least significant bit (LSB or LSV - least significant variable). So, for example, if M = {P = true, Q = false, R = false} then $Int_{\{P,Q\}}(M)$ = (10 in binary code) = 2; $Int_{\{P,Q,R\}}(M)$ = (100 in binary code) = 4.

•  X-Largest (Smallest) model - the model with the largest (smallest) value ($Int_X(M)$) with respect to a given ordered set of variables $X = \{P_r, \ldots, P_1\}$.
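To make the definitions of Horn theory and unit propagation concrete, here is a minimal sketch. It assumes a DIMACS-like encoding that is not part of the paper: an atom is a positive integer, a negative literal is the negated integer, and a clause is a collection of literals. The names `is_horn` and `unit_propagate` are hypothetical, and returning `None` when an empty clause appears is an added convention for signalling a conflict.

```python
def is_horn(clauses):
    """A theory is Horn iff every clause has at most one positive literal."""
    return all(sum(1 for lit in clause if lit > 0) <= 1 for clause in clauses)

def unit_propagate(clauses):
    """Apply unit propagation until no unit clause is left.

    Returns (remaining clauses, literals fixed by propagation), or
    (None, literals) if an empty clause is derived.
    """
    clauses = {frozenset(c) for c in clauses}
    assigned = set()
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assigned
        (lit,) = unit
        assigned.add(lit)
        remaining = set()
        for clause in clauses:
            if lit in clause:
                continue                      # clause contains C: delete it
            reduced = clause - {-lit}         # delete the negation of C
            if not reduced:
                return None, assigned         # empty clause: conflict
            remaining.add(reduced)
        clauses = remaining
```

Under this encoding, the result of unit propagation on the union of T and S would be obtained as `unit_propagate(list(T) + list(S))`. For instance, propagating on `[[1], [-1, 2], [-2, 3, 4], [1, 5]]` fixes atoms 1 and 2 and leaves the single clause `{3, 4}`.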

X-minimal model

Let T be a theory over a set of atoms L, $X \subseteq L$, and M a model for T. M is an X-super of another model M' if and only if $pos(M') \cap X$ is a proper subset of $pos(M) \cap X$.

M is an X-minimal model for T if and only if there is no other model M' for T such that M is an X-super of M'. If M is an X-minimal model for X = L, it will be called simply a minimal model. It has been shown that a Horn theory has a unique minimal model that can be found in linear time [DG84].

Find-X-minimal

Input: A theory T, an ordered set $X = \{P_r, \ldots, P_1\}$ which is a subset of the atoms in T.

Output: true if T is satisfiable, false otherwise. In case T is satisfiable, the output variable M is the smallest X-minimal model of T w.r.t. the ordering $\{P_r, \ldots, P_1\}$.

1. If $\neg$sat(T, M) return false;

2. For i := r downto 1 do

   a. If isHorn(T) then M = HornMinimalModel(T), Goto 4.

   b. If sat($T \cup \{\neg P_i\}$, M) then $T := T \oplus \{\neg P_i\}$
      else $T := T \oplus \{P_i\}$

3. sat(T, M);

4. Return true;

Fig. 2. Algorithm Find-X-minimal

Example 2.1 Suppose a theory T has variables $P_6, \ldots, P_0$, and $X = \{P_6, \ldots, P_3\}$. Assume further that T has exactly four models (ordered from the X-smallest to the X-largest): $M_1 = 0010110$, $M_2 = 0011000$, $M_3 = 0100111$, and $M_4 = 1100000$. $M_1$ and $M_3$ are the only X-minimal models of T.

Throughout  this  paper,  unless  stated  otherwise,  models  that  agree  on  the  truth 
assignments  given  to  all  the  variables  in  X  are  considered  identical. 

The  following  theorem  is  quite  interesting.  Its  proof  is  based  on  results  from 
Combinatorics  [Bo86].  We  provide  in  the  appendix  a  proof  suggested  by  Lomonosov 
[LoOO]. 

Theorem 2.2: Let T be a theory and let X be a subset of the atoms that are used in T. The number of X-minimal models of T is at most $\binom{n}{\lfloor n/2 \rfloor}$, where |X| = n.

S-X-Min(T, X)

Input: A theory T, an ordered set $X = \{P_r, \ldots, P_1\}$ which is a subset of the variables in T.

Output: true if T is satisfiable, false otherwise. If T is satisfiable, output one by one all X-minimal models of T from the smallest to the largest w.r.t. the order $\{P_r, \ldots, P_1\}$.

1. If $\neg$sat(T, M) return false.

2. ModelsTable = $\emptyset$.

3. For i := 0 to $2^r - 1$ do:

   a. v := instantiation of X that equals i ($P_1$ least significant, $P_r$ most significant).

   b. If v is not an X-super of a model in ModelsTable
      then
         if sat($T \cup Lit(v)$, M)
            Output(M);
            Add M to ModelsTable;


Fig. 3. Algorithm S-X-Min


Find X-Minimal

In Figure 2 we show an algorithm for computing one X-minimal model for a theory. The algorithm takes O(|X|) steps and uses O(|X|) calls to a satisfiability testing procedure. A similar algorithm was shown in [BD96]. Find-X-minimal tries to assign false to as many variables as possible and calls a Horn satisfiability checker once there are enough instantiations so that the theory becomes Horn.

Notes on Find-X-Minimal (for future use):

Note  1:  if  the  theory  T  is  not  satisfiable  then  M  is  returned  with  the  value  it  was 
initialized  with. 

Note 2: The algorithm uses a procedure sat(T,M) that returns true iff T is satisfiable.
In case T is satisfiable, M is a model of T. Each model M is an array of booleans, M[i]
being the truth value assigned to $P_i$. It might be the case that M has entries for
variables  not  appearing  in  T.  These  variables  will  be  assigned  false  by  sat(T,M).  We 
do  not  always  use  the  model  that  sat  returns.  In  implementations,  we  can  use  a 
version  of  sat  that  does  not  return  a  model  when  we  do  not  need  it. 

Note  3:  If  sat(T,M)  is  complete  then  Find-X-Minimal  is  also  complete,  otherwise 
Find-X-Minimal  is  not  complete. 
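The greedy idea behind Find-X-minimal can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation evaluated in the paper: the `sat` function is a brute-force stand-in for a complete SAT procedure, clauses use a DIMACS-like integer encoding (negative integer for a negated atom), unit clauses are simply appended instead of applying unit propagation, and the Horn shortcut of step 2a is omitted. The names `sat` and `find_x_minimal` are hypothetical.

```python
from itertools import product

def sat(clauses, atoms):
    """Brute-force stand-in for a complete SAT procedure.

    Returns a model as a dict atom -> bool, or None if unsatisfiable.
    """
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return model
    return None

def find_x_minimal(clauses, atoms, x):
    """Return the smallest X-minimal model, or None if unsatisfiable.

    `x` lists the atoms of X from most significant to least significant.
    """
    if sat(clauses, atoms) is None:
        return None
    theory = [list(c) for c in clauses]
    for p in x:
        # Try to fix p to false; fall back to true if that is inconsistent.
        if sat(theory + [[-p]], atoms) is not None:
            theory.append([-p])
        else:
            theory.append([p])
    return sat(theory, atoms)
```

For the theory with clauses `[[1, 2], [-1, 3]]`, calling `find_x_minimal` with `x = [1, 2]` sets atom 1 to false and then has to set atom 2 to true, while `x = [2, 1]` yields the other X-minimal model. The loop makes one satisfiability call per atom of X, matching the O(|X|) bound on calls stated in the text.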


3  New Algorithms

In  this  section  we  will  present  two  algorithms  for  computing  X-minimal  models. 

The correctness of these algorithms can be proved only if a complete SAT procedure is assumed. Otherwise the algorithms are not complete and may generate a model which is not X-minimal.

The  first  algorithm,  called  S-X-min,  goes  over  all  possible  instantiations  to  X,  using 
an  ordering  having  the  property  that  whenever  a  model  is  not  X-minimal,  then  it  must 
be  an  X-super  of  an  X-minimal  model  already  generated.  S-X-min  is  shown  in 
Figure  3. 
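A compact way to picture S-X-min is as a generator that counts through the instantiations of X and skips every instantiation that is an X-super of a model already produced. The sketch below is an illustration only: `sat` is a brute-force stand-in for a complete SAT procedure, clauses use a DIMACS-like integer encoding, and the names `sat` and `s_x_min` are hypothetical. An unsatisfiable theory simply yields nothing.

```python
from itertools import product

def sat(clauses, atoms):
    """Brute-force stand-in for a complete SAT procedure (model or None)."""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return model
    return None

def s_x_min(clauses, atoms, x):
    """Yield the X-minimal models one by one, in increasing Int_X order.

    `x` lists the atoms of X from most significant to least significant.
    """
    models_table = []   # positive X-parts of the models output so far
    # product(...) enumerates the instantiations of X as 0, 1, ..., 2^r - 1.
    for bits in product([False, True], repeat=len(x)):
        positive = {p for p, b in zip(x, bits) if b}
        if any(previous < positive for previous in models_table):
            continue    # X-super of an earlier model: cannot be X-minimal
        literals = [[p] if b else [-p] for p, b in zip(x, bits)]
        model = sat([list(c) for c in clauses] + literals, atoms)
        if model is not None:
            models_table.append(positive)
            yield model
```

Because it is a generator, a caller can consume the first X-minimal model before the remaining instantiations are examined, which is the demand-driven behavior the paper aims for. The loop still visits up to $2^r$ instantiations, which is what the second algorithm tries to avoid.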

J-X-min(T, X)

Input: A theory T, a subset X of all the variables in T. We assume that |X| = r.

Output: true if T is satisfiable, false otherwise. In case T is satisfiable, output one by one all X-minimal models of T from the smallest to the largest. Each model is an array of booleans M, M[i] is the truth value assigned to $P_i$.

1. Let $P_{n-1}, \ldots, P_0$ be an ordering on the variables in T such that the first r variables are all the variables from X. Variable $P_0$ will be considered the least significant and the variable $P_{n-1}$ will be considered the most significant.

2. If Find-X-minimal(T, $\{P_{n-1}, \ldots, P_{n-r}\}$, M) = false return false;

3. ModelsTable = $\emptyset$.

4. Output(M); Add M to ModelsTable.

5. Let i be the index of the least significant variable that satisfies:

   a. $P_i \in X$

   b. M[i] = false

   c. $P_i$ is more significant than another variable $P_j$ such that $P_j \in X$ and M[j] = true;

   if there is no such variable return true.

6. M[i] = true;

7. If Find-X-minimal($T \oplus Lit(M[n-1, \ldots, i])$, $\{P_{i-1}, \ldots, P_{n-r}\}$, M) = false then goto 5.

8. If M is not an X-super of a model in ModelsTable then goto 4. Else goto 5.


Fig. 4. Algorithm J-X-Min


Chen Avin and Rachel Ben-Eliyahu-Zohary


Lemma 3.1: Algorithm S-X-Min is correct.

The proof is omitted due to space constraints. The basic argument is that a model which is not X-minimal must be an X-super of a model that is X-smaller. Since the models are generated in increasing integer ($Int_X$) order, all and only the X-minimal models will be generated.

The second algorithm that we present is algorithm J-X-min, shown in Figure 4. For each theory T there is a (possibly empty) set $\Omega$ of all the X-minimal models of T. You can order the n X-minimal models in $\Omega$ in order $\{M_1, \ldots, M_n\}$ where $M_1$ is the smallest X-minimal model and $M_n$ is the largest. The algorithm is based on the following observation:

Lemma 3.2: Algorithm Find-X-minimal (Figure 2) always returns the smallest X-minimal model for a given variable ordering (if one exists).

The proof is omitted due to space constraints. Intuitively, the Lemma is true because Find-X-minimal tries to assign false to as many variables as possible and backtracks on this choice at as insignificant a bit as possible.

Once we find the first, smallest, X-minimal model, we search for the next one. Suppose that |X| = 4 and the smallest model is 0100.... There is no point in checking whether truth assignments starting with 0101 or 0110 are models, because it is obvious that they are X-supers of the first model. Hence the algorithm will "jump" to check whether truth assignments starting with 1000 may be models. Hence the "J" in the algorithm name.

Theorem 3.3: Algorithm J-X-Min is correct.

Proof (sketch): Let T be a theory and let X be a subset of its variables. If T is inconsistent, the algorithm is clearly correct. Assume T is consistent. First, observe that models generated in Step 7 are always generated from the X-smallest to the X-largest. Let $M_0, \ldots, M_k$ be all the X-minimal models of T, ordered from the X-smallest to the X-largest according to an ordering of X. We will show by induction on $0 \le t \le k$ that the t-th model that J-X-min outputs is $M_t$.

Base case: Follows from Lemma 3.2.

Case t > 0: Assume by contradiction that $M' \neq M_t$ is the t-th model that the algorithm outputs. By the induction hypothesis, it must be the case that $Int_X(M_{t-1}) < Int_X(M_t) < Int_X(M')$.

Let us look at the last time that Step 5 of the algorithm is executed just before model M' is sent to output.

Let i be the index that the algorithm finds at this last step.

Let M* be an instantiation of the variables in T defined as follows: $M^*[i] = true$, $M^*[i-1, \ldots, n-r] = \{false\}$. It is clear that $Int_X(M_{t-1}) < Int_X(M^*) \le Int_X(M')$. There are 2 cases:

1. $Int_X(M_{t-1}) < Int_X(M_t) < Int_X(M^*)$. Then there is a contradiction, because in this case $M_t$ must be an X-super of $M_{t-1}$ (some of the variables $(P_{i-1}, \ldots, P_{n-r})$ that were false become true instead), hence the algorithm will not output it.

2. $Int_X(M^*) \le Int_X(M_t) < Int_X(M')$. Then the following must be true:

2.1 $M'[n-1, \ldots, i] = M_t[n-1, \ldots, i]$,

2.2 $Int(M_t[i-1, \ldots, n-r]) < Int(M'[i-1, \ldots, n-r])$.
But by Lemma 3.2, when we execute Find-X-minimal with $\{P_{i-1}, \ldots, P_{n-r}\}$ it returns the smallest possible value of $\{P_{i-1}, \ldots, P_{n-r}\}$, and therefore $M_t$ cannot be a minimal model that satisfies 2.1 & 2.2, a contradiction.

It is left to show that $M_k$ is the last model sent to output. This is obvious because all the models generated after $M_k$ are not X-minimal and hence must be X-supers of at least one of the X-minimal models that are already in ModelsTable. □

Example 3.4 Consider again the theory T from Example 2.1, and suppose J-X-min is called with $X = \{P_6, \ldots, P_3\}$. The first (and smallest) X-minimal model returned by Find-X-minimal is $M_1 = 0010110$. After that, at Step 5, we choose i = 5 and call Find-X-minimal($T \oplus \{\neg P_6, P_5\}$, X, M). Find-X-minimal will return model $M_3 = 0100111$, which is the 2nd and last X-minimal model of T. At Step 5 after that we choose i = 6 and call Find-X-minimal($T \oplus \{P_6\}$, X, M). Find-X-minimal will return model M = 1100000. M is an X-super of $M_3$ and therefore will not be sent to output. In the next iteration, no i will be found, and the algorithm will terminate. You can see that out of 16 possible assignments to X, only 3 models were considered by J-X-min.
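The run described in this example can be replayed with a small sketch of the jump strategy. It is an illustration under explicit assumptions: the theory is represented directly by its list of models, written as bit strings with the most significant variable first, so `find_x_minimal` here is only a stand-in that looks up the smallest listed model extending a fixed prefix rather than calling a SAT procedure. The names `find_x_minimal`, `is_x_super` and `j_x_min` are hypothetical.

```python
def find_x_minimal(models, r, prefix):
    """Stand-in for Find-X-minimal on the theory restricted by `prefix`:
    the X-smallest listed model whose leading bits equal `prefix`."""
    matches = [m for m in models if m.startswith(prefix)]
    return min(matches, key=lambda m: m[:r]) if matches else None

def is_x_super(model, other, r):
    """True iff the true X-bits of `other` are a proper subset of `model`'s."""
    a = {k for k in range(r) if model[k] == "1"}
    b = {k for k in range(r) if other[k] == "1"}
    return b < a

def j_x_min(models, r):
    """Return (X-minimal models in output order, all models considered)."""
    outputs, considered = [], []
    m = find_x_minimal(models, r, "")
    if m is None:
        return outputs, considered
    outputs.append(m)
    considered.append(m)
    xbits = m[:r]
    while True:
        # Step 5: least significant false X-bit that has a true X-bit
        # in a less significant position.
        candidates = [k for k in range(r)
                      if xbits[k] == "0" and "1" in xbits[k + 1:]]
        if not candidates:
            return outputs, considered
        k = max(candidates)
        prefix = xbits[:k] + "1"                  # Step 6: set that bit
        m = find_x_minimal(models, r, prefix)     # Step 7: jump
        if m is None:
            xbits = prefix + xbits[k + 1:]
            continue
        considered.append(m)
        xbits = m[:r]
        if not any(is_x_super(m, o, r) for o in outputs):   # Step 8
            outputs.append(m)

EXAMPLE_MODELS = ["0010110", "0011000", "0100111", "1100000"]
minimal, seen = j_x_min(EXAMPLE_MODELS, 4)
```

On the four models of Example 2.1, `minimal` holds 0010110 and 0100111 and `seen` holds three models, in line with the count given in the example; model 0011000 is never examined because the algorithm jumps past it.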


4  Computing  Circumscription 

In  this  section  we  will  show  how  our  algorithms  can  be  used  for  computing 
circumscription.  First,  we  will  formulate  deduction  in  circumscription  in  model- 
theoretic  terminology.  We  will  use  propositional  logic  version  of  definitions  from 
[GPP89,Li85,Mc80].  In  this  section  we  assume  that  T  is  some  propositional  theory 
and  that  there  is  a  partition  of  all  the  atoms  in  T  into  three  disjoint  sets  of  atoms:  P,Z, 
and  Q. 

Definition 4.1 [GPP89]. For any two models M and N of T we write $M \le N$ mod (P,Z) if models M and N differ only on how they interpret predicates from P and Z and if $pos(M) \cap P$ is a subset of $pos(N) \cap P$. We say that a model M is (P,Z)-minimal if there is no model N such that $N < M$ mod (P,Z) (i.e. such that $N \le M$ but not $M \le N$).

That is, in order for M to be (P,Z)-minimal, the following must hold: for every model N such that M and N grant the same truth value to all the atoms in Q, the set of all atoms in P to which N assigns true must not be a proper subset of the set of all atoms in P to which M assigns true; and we don’t care about the truth assignment these models give to atoms in Z.

Theorem  4.2  [GPP89].  For  any  clause  c,  we  say  that  c  follows  from  (T,P,Z)  by 
circumscription  if  and  only  if  c  is  true  in  every  (P,Z)-minimal  model  of  T. 

Example 4.3 Assume T is the following theory, having the intuitive meaning that children normally like McDonald’s, and Itamar is a child:

$Child \wedge \neg Ab \rightarrow LikesMD$.

$Child$.

Suppose we want to know whether Itamar likes McDonald’s. The intuition is that the answer is yes. Using classical logic, LikesMD does not follow from this theory. However, taking P = {Ab} and Z = {LikesMD} we get that LikesMD follows from (T,P,Z) by circumscription. To see this, note that T has exactly three models: $M_1 = \{Child, Ab, LikesMD\}$, $M_2 = \{Child, Ab, \neg LikesMD\}$, and $M_3 = \{Child, \neg Ab, LikesMD\}$.

$M_3$ is the only (P,Z)-minimal model of T.

The algorithms developed here can be used for a demand-driven computation of (P,Z)-minimal models of T, in the following way:

1. Use some backtracking algorithm to find all consistent (with T) truth assignments for the variables in Q.

2. For each assignment v found in step 1, compute one by one the P-minimal models of $T \cup Lit(v)$. Each model generated is a (P,Z)-minimal model of T.

A  demand-driven  computation  is  useful  here  because  it  may  help  us  refute 
conclusions  before  all  the  models  are  generated  (if  a  fact  does  not  follow  from  some 
model  it  certainly  does  not  follow  from  all  of  the  models). 
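The McDonald’s example can be verified with a short brute-force check that follows Definition 4.1 directly. It is only a sketch for this three-atom theory, not the demand-driven procedure above: all assignments are enumerated and a model is kept when no other model agrees with it on Q while making a proper subset of P true. The names `is_model` and `pz_minimal_models` are hypothetical.

```python
from itertools import product

ATOMS = ["Child", "Ab", "LikesMD"]
P = {"Ab"}        # atoms to be minimized
Q = {"Child"}     # atoms held fixed; Z = {"LikesMD"} is left free to vary

def is_model(m):
    """Models of: Child and not Ab -> LikesMD;  Child."""
    rule = (not m["Child"]) or m["Ab"] or m["LikesMD"]
    return rule and m["Child"]

def pz_minimal_models():
    models = [m for m in (dict(zip(ATOMS, values))
                          for values in product([False, True], repeat=len(ATOMS)))
              if is_model(m)]

    def strictly_smaller(n, m):
        """n < m mod (P,Z): same values on Q, proper subset of true P-atoms."""
        same_q = all(n[a] == m[a] for a in Q)
        return same_q and {a for a in P if n[a]} < {a for a in P if m[a]}

    return [m for m in models if not any(strictly_smaller(n, m) for n in models)]

minimal_models = pz_minimal_models()
follows = all(m["LikesMD"] for m in minimal_models)
```

The only model returned is the one in which Ab is false and LikesMD is true, so `follows` is true, matching the conclusion that LikesMD follows by circumscription. In a demand-driven setting one would stop as soon as a minimal model falsifying the query is produced.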


Fig.  6.  Growing  X  size 


Fig. 7. Comparing symbol order heuristics


5  Experiments 

We have tested algorithm J-X-min on a suite of hard randomly generated CNF problems (theories). The problems are difficult random 3CNF problems as described in [SKC94] and [SK96]. The algorithm was tested on theories of size 50/218 (50 symbols, 218 clauses). The theories were taken from SATLIB [SAT00]. The algorithms were implemented in Java on a PC with a Pentium 3 600 MHz processor and 128 MB of memory. We chose Java because we intended to build an object-oriented library of tools for computing minimal models. We used a Java implementation of walksat [SKC94] as an (incomplete) SAT procedure. Since Java is a relatively inefficient language in terms of running time, we did not pay much attention to the absolute running time of the algorithms in these experiments. However, we do believe that running time is an important factor and we plan to implement the algorithms in C in order to improve their time performance.

We ran three experiments:

1. Compute all the minimal models of the theories and compare the results to
the results of a complete procedure (the dlv system [KL99]).

2. Check the growth in run time as a function of an increasing size of X.

3. Compute all the minimal models using different symbol order heuristics.

The first experiment has shown that, in spite of using an incomplete SAT
algorithm, we succeeded in computing all and only the minimal models of the
theories. The set of minimal models computed by our algorithm was exactly the
set of minimal models computed by dlv. We expect, though, that on much larger
theories an incomplete algorithm will be less accurate.

Results of the second experiment are shown in Figure 6. As expected, the run
time of the algorithm (given in seconds) grows as X grows. It is encouraging
to see that the first X-minimal model is generated in about 25% of the time
it takes to compute all the models, since one of the goals of this project
was to output the models on a demand-driven basis.

The ordering of the theory variables before calling algorithm J-X-min might
be crucial. Once enough instantiations are made so that the theory is Horn, a
linear algorithm can be called upon to finish the minimal model computation.
In the third experiment we computed X-minimal models where X is the set of
all variables in the theory. On each theory, we tested J-X-min five times:
four times with a random symbol order, and once with a symbol order in which
the vertex cover of the theory comes first. In general, the problem of
finding a minimum-cardinality vertex cover of a graph is NP-hard. The greedy
heuristic procedure that we used for finding a vertex cover simply removes
the node with maximum degree from the graph and continues with the reduced
graph until all nodes are disconnected. The set of all nodes removed is a
vertex cover.
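
The greedy heuristic just described can be sketched as follows. This is an illustrative sketch only (the experiments themselves were run in Java); the function name `greedy_vertex_cover` and the edge-list input format are assumptions, and self-loops are ignored for simplicity.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def greedy_vertex_cover(
        edges: Iterable[Tuple[Hashable, Hashable]]) -> List[Hashable]:
    """Repeatedly remove a node of maximum degree until no edge is left."""
    adj: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        if u == v:
            continue  # assumption: self-loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cover: List[Hashable] = []
    while any(adj.values()):  # some node still has a neighbour
        node = max(adj, key=lambda x: len(adj[x]))  # maximum-degree node
        for neighbour in adj.pop(node):
            adj[neighbour].discard(node)  # delete the removed node's edges
        cover.append(node)
    return cover


example_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
example_cover = greedy_vertex_cover(example_edges)
```

The result is a vertex cover but not necessarily a minimum one, which matches the heuristic nature of the procedure.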

We have compared the run-time results of J-X-min in vertex cover order and in
random order. For the random order, we took the best, worst, and mean run
time. We have divided the results according to the size of the vertex cover
of the theory. The results of this experiment are summarized in Figure 7 (run
time is in seconds). We can see that in general the run time of J-X-min does
not grow as the size of the vertex cover grows. We explain this finding by
the fact that we use the walksat algorithm. When the walksat algorithm checks
the satisfiability of an inconsistent theory, it stops after a time-out
(measured by the number of flips and restarts). This time-out is almost
constant, and hence the run time of J-X-min is more or less constant on
theories of the same size with different vertex cover sizes. We can see that
the vertex cover heuristic is quite good: it is always better than the worst
and mean run time of J-X-min with random order, and usually even better than
the best.


6  Related Work

During the last few years there have been several studies regarding the
problem of minimal model computation. Ben-Eliyahu and Dechter [BD96] have
presented several algorithms for computing minimal models, all of them
different from the ones proposed here. One limitation of the algorithms
presented there is that they produce a superset of all minimal models, while
we produce the minimal models one by one. Ben-Eliyahu and Palopoli [BP97]
have presented a polynomial algorithm for finding a minimal model, but it
works only for a subclass of all CNF theories and it finds only one minimal
model.

The systems dlv [KL99] and smodels [JNS00] compute stable models of
disjunctive logic programs. If integrity constraints are allowed in the
programs, every knowledge base can be represented as a disjunctive logic
program such that the set of all minimal models of the knowledge base
coincides with the set of all stable models of the program. An advantage of
our approach compared to theirs is that our algorithms compute X-minimal
models one at a time, while with their approach we first have to compute all
minimal models and then select the X-minimal ones. Another difference is that
our implementation uses local search techniques.


7  Conclusions 

The task of computing X-minimal models is very relevant in many knowledge
representation systems, and particularly in Diagnosis and Circumscription. We
have presented two new algorithms to perform this task. The algorithms are
demand driven, and can be implemented either by using incomplete local search
procedures or by using complete procedures. In the future we plan to combine
the algorithms presented here with the algorithm developed in [Be00] in order
to produce a distributed algorithm for computing X-minimal models.


Appendix

In the following text, unless otherwise stated, we assume some fixed theory T
and some fixed subset X of all atoms in T, where |X| = x.

We take two truth assignments to be different only if they disagree on the
set of atoms X.

The question we want to raise is: What is the maximum number of different
X-minimal models that such a theory T may have?

Definition 1: A truth-assignment X-chain (or X-chain for short) is an ordered
set of assignments where each assignment is an X-super of the next
assignment.

Definition 2: An X-chain set is a set of X-chains such that each possible
truth assignment to X appears in exactly one X-chain. So there are exactly
2^x assignments in all the X-chains altogether.

We define C_r as an X-chain set that contains exactly r X-chains.

Lemma 1: For any theory T and set X of atoms in T such that T has a total of
j different X-minimal models, and for any X-chain set C_r for T, r ≥ j.




Proof: We will prove by contradiction that j cannot be larger than r. It is
obvious that two different X-minimal models of T must belong to different
X-chains of C_r (one X-minimal model cannot be an X-super of another
X-minimal model by definition, and therefore they cannot be in the same
X-chain). If j > r, then there must be two different X-minimal models that
belong to the same X-chain in C_r. A contradiction.

Lemma 2: If there is a theory T having exactly j X-minimal models and an
X-chain set C_r where r = j, then for any theory T' and for any set of atoms
X' in T' such that |X'| = |X|, T' may have at most j X'-minimal models.

Proof: It is easy to see that since T has an X-chain set of size r, T' must
also have an X'-chain set of size r. Assume that T' has j' X'-minimal models
with j' > j. By Lemma 1, r ≥ j'. But we also know that r = j, so we get that
j ≥ j'. A contradiction.


Theorem: The maximum number of X-minimal models of a theory is

$$\binom{n}{\lfloor n/2 \rfloor},$$

where |X| = n.

Proof: First, we will show that there is some theory T having exactly
$\binom{n}{\lfloor n/2 \rfloor}$ X-minimal models for some subset X of all
the atoms in T with |X| = n. We define T to be the theory that has exactly
$\binom{n}{\lfloor n/2 \rfloor}$ models, where each model has a different set
of $\lfloor n/2 \rfloor$ true atoms that belong to some fixed set X of atoms
with |X| = n, while all the other atoms in the model are assigned false. In
this case each model is also an X-minimal model.
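
The witness theory used in this first half of the proof can be checked by brute force for small n. The sketch below is only a sanity check under stated assumptions: a model is identified with the set of atoms of X that it makes true, and the names `witness_models` and `minimal_models` are hypothetical.

```python
from itertools import combinations
from math import comb
from typing import FrozenSet, List


def witness_models(n: int) -> List[FrozenSet[int]]:
    """All models of the witness theory: exactly floor(n/2) atoms of X true."""
    return [frozenset(c) for c in combinations(range(n), n // 2)]


def minimal_models(models: List[FrozenSet[int]]) -> List[FrozenSet[int]]:
    """Keep the models that have no proper subset among the models."""
    return [m for m in models if not any(other < m for other in models)]


# For each small n, every model of the witness theory is minimal, so the
# number of X-minimal models equals the binomial coefficient.
counts = {n: len(minimal_models(witness_models(n))) for n in range(1, 8)}
bounds = {n: comb(n, n // 2) for n in range(1, 8)}
```

Since all models have the same number of true atoms, none can be a proper subset of another, which is why the two dictionaries agree.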


Next we will show that there is an X-chain set of size
$\binom{n}{\lfloor n/2 \rfloor}$. This will complete the proof because it
means (according to Lemma 2) that this is the maximum number of X-minimal
models that a theory may have. We will divide the $2^n$ different assignments
to X into the following sets, which reflect the number of atoms in X assigned
true by the assignment:


We  will  build  the  chains  in  the  set  as  follows.  We  start  with 


chains,  each  having  one 


assignment  that  belongs  to  the  set  .  We  then  add  the  assignments  in  the  set  to  the 


existing chains, possibly starting a new chain, and so on. The X-chains grow
by creating complete matchings in bipartite graphs where the set of vertices
V is the union of the
in this graph reflect the X-super relation and we can show that in this case we can find a
complete matching.


Abstract. In this paper, we continue to explore many-valued disjunctive
logic programs with probabilistic semantics. In particular, we newly
introduce the least model state semantics for such programs. We show
that many-valued disjunctive logic programs under the semantics of
minimal models, perfect models, stable models, and least model states can
be unfolded to equivalent classical disjunctive logic programs under the
respective semantics. Thus, existing technology for classical disjunctive
logic programming can be used to implement many-valued disjunctive
logic programming. Using these results on unfolding many-valuedness,
we then give many-valued fixpoint characterizations for the set of all
minimal models and the least model state. We also describe an iterative
fixpoint characterization for the perfect model semantics under finite
local stratification.


1  Introduction 

In a previous paper [5], we introduced many-valued disjunctive logic programs
with probabilistic semantics. In particular, we defined minimal, perfect, and
stable models for such programs, and showed that they have the same
properties as their classical counterparts. For example, perfect and stable
models are always minimal models. Under local stratification, the perfect
model semantics coincides with the stable model semantics. Moreover, we also
showed that some special cases of propositional many-valued disjunctive logic
programming under minimal, perfect, and stable model semantics have the same
complexity as their classical counterparts.

In this paper, we continue this line of research on many-valued disjunctive
logic programming with probabilistic semantics. The central topic of the
present paper is to elaborate algorithms for many-valued disjunctive logic
programming. One way of obtaining such algorithms is to translate many-valued
disjunctive logic programs into classical formalisms, and to work with
existing algorithms for the classical formalisms. Another way is to simply
develop completely new algorithms.

In this paper, we follow both directions. We first show that many-valued
disjunctive logic programs under minimal models, perfect models, stable
models, and least model states can be unfolded to equivalent classical
disjunctive logic programs under the respective semantics. Thus, existing
technology for classical disjunctive logic programming can be used to
implement many-valued disjunctive logic programming.

Using these results on unfolding many-valuedness, we then develop new
many-valued fixpoint characterizations for the semantics of minimal models,
least model states, and perfect models under finite local stratification.

It is important to point out that our many-valued disjunctive logic programs
have a probabilistic semantics in probabilities over possible worlds.
Furthermore, the truth values of all clauses are truth-functionally defined
on the truth values of atoms. This gives our many-valued disjunctive logic
programs both nice computational properties (compared to purely probabilistic
approaches) and a nice probabilistic semantics. The latter is expressed in
the fact that our many-valued disjunctive logic programming under the minimal
model and the least model state semantics is an approximation of purely
probabilistic disjunctive logic programming.

We showed in [6,7] that many-valued definite logic programming with this
probabilistic semantics has a model and fixpoint characterization and a proof
theory similar to classical definite logic programming. Moreover, special
cases of many-valued logic programming with this semantics were shown to have
the same computational complexity as their classical counterparts.
Interestingly, our approach in [6,7] is closely related to van Emden's
quantitative deduction [19], which interprets the implication connective as
conditional probability, while our work uses the material implication.

The  main  contributions  of  this  paper  can  be  summarized  as  follows. 

•  We introduce the least model state semantics for positive many-valued
disjunctive logic programs with probabilistic semantics.

•  We show that many-valued disjunctive logic programs under minimal model,
perfect model, stable model, and least model state semantics can be unfolded
to equivalent classical disjunctive logic programs under the respective
semantics.

•  We provide fixpoint characterizations for the set of all minimal models and
the least model state of positive many-valued disjunctive logic programs.

•  We describe an iterative fixpoint characterization for the perfect model of
many-valued disjunctive logic programs that have a finite local stratification.

Note  that  proofs  of  all  results  are  given  in  the  extended  paper  [8]. 

2  Preliminaries 

In  this  section,  we  recall  some  necessary  definitions  and  results  from  [5]. 

2.1  Probabilistic  Background 

Let $\Phi$ be a first-order vocabulary that contains a set of function
symbols and a set of predicate symbols (as usual, constant symbols are
function symbols of arity zero). Let $\mathcal{X}$ be a set of variables. We
define terms by induction as follows. A term is a variable from $\mathcal{X}$
or an expression of the form $f(t_1, \ldots, t_k)$, where $f$ is a function
symbol of arity $k \ge 0$ from $\Phi$ and $t_1, \ldots, t_k$ are terms. We
define classical formulas by induction as follows. If $p$ is a predicate
symbol of arity $k \ge 0$ from $\Phi$ and $t_1, \ldots, t_k$ are terms, then
$p(t_1, \ldots, t_k)$ is a classical formula (called atom). If $F$ and $G$
are classical formulas, then also $\neg F$ and $(F \land G)$. Literals,
positive literals, and negative literals are defined as usual. We define
probabilistic formulas inductively as follows. If $F$ is a classical formula
and $c$ is a real number from $[0, 1]$, then $prob(F) \ge c$ is a
probabilistic formula (called atomic probabilistic formula). If $F$ and $G$
are probabilistic formulas, then also $\neg F$ and $(F \land G)$. We use
$(F \lor G)$ and $(F \leftarrow G)$ to abbreviate
$\neg(\neg F \land \neg G)$ and $\neg(\neg F \land G)$, respectively, and
adopt the usual conventions to eliminate parentheses. Terms and formulas are
ground iff they do not contain any variables. Substitutions, ground
substitutions, and ground instances of formulas are defined as usual.

A classical interpretation $I$ is a subset of the Herbrand base $HB_\Phi$
over $\Phi$. A variable assignment $\sigma$ assigns to each
$x \in \mathcal{X}$ an element from the Herbrand universe $HU_\Phi$ over
$\Phi$. It is by induction extended to terms by
$\sigma(f(t_1, \ldots, t_k)) = f(\sigma(t_1), \ldots, \sigma(t_k))$ for all
terms $f(t_1, \ldots, t_k)$. The truth of classical formulas $F$ in $I$ under
$\sigma$, denoted $I \models_\sigma F$, is inductively defined as follows (we
write $I \models F$ when $F$ is ground):

•  $I \models_\sigma p(t_1, \ldots, t_k)$ iff
$p(\sigma(t_1), \ldots, \sigma(t_k)) \in I$.

•  $I \models_\sigma \neg F$ iff not $I \models_\sigma F$, and
$I \models_\sigma (F \land G)$ iff $I \models_\sigma F$ and
$I \models_\sigma G$.

A probabilistic interpretation (or p-interpretation) $p = (\mathcal{I}, \mu)$
consists of a set $\mathcal{I}$ of classical interpretations (called possible
worlds) and a discrete probability function $\mu$ on $\mathcal{I}$ (that is,
a mapping $\mu$ from $\mathcal{I}$ to the real interval $[0, 1]$ such that
all $\mu(I)$ with $I \in \mathcal{I}$ sum up to 1 and that the number of all
$I \in \mathcal{I}$ with $\mu(I) > 0$ is countable). The truth value of a
formula $F$ in a p-interpretation $p$ under a variable assignment $\sigma$,
denoted $p_\sigma(F)$, is defined as the sum of all $\mu(I)$ such that
$I \in \mathcal{I}$ and $I \models_\sigma F$ (we write $p(F)$ when $F$ is
ground). The truth of probabilistic formulas $F$ in $p$ under $\sigma$,
denoted $p \models_\sigma F$, is defined as follows (we write $p \models F$
when $F$ is ground):

•  $p \models_\sigma prob(F) \ge c$ iff $p_\sigma(F) \ge c$.

•  $p \models_\sigma \neg F$ iff not $p \models_\sigma F$, and
$p \models_\sigma (F \land G)$ iff $p \models_\sigma F$ and
$p \models_\sigma G$.

The probabilistic formula $F$ is true in $p$, or $p$ is a model of $F$,
denoted $p \models F$, iff $F$ is true in $p$ under all variable assignments
$\sigma$. The p-interpretation $p$ is a model of a set of probabilistic
formulas $\mathcal{F}$, denoted $p \models \mathcal{F}$, iff $p$ is a model
of all $F \in \mathcal{F}$. A set of p-interpretations $\mathcal{P}$ is a
model of $F$ (resp., $\mathcal{F}$), denoted $\mathcal{P} \models F$ (resp.,
$\mathcal{P} \models \mathcal{F}$), iff every member of $\mathcal{P}$ is a
model of $F$ (resp., $\mathcal{F}$).


2.2  Positively  Correlated  Probabilistic  Interpretations 

We restrict our attention to the following kind of p-interpretations (that
is, we assume another axiom besides the axioms of probability). A positively
correlated probabilistic interpretation (or pcp-interpretation) is a
p-interpretation $p$ such that

$p(A \land B) = \min(p(A), p(B))$ for all $A, B \in HB_\Phi$.    (1)

Note that the condition $p(A \land B) = \min(p(A), p(B))$ is just assumed for
ground atoms $A$ and $B$. It brings probabilistic logics over possible worlds
closer to truth-functional logics. We do not assume that (1) always holds in
the part of the real world that we want to model. The axiom (1) is simply a
technical assumption that carries us to a form of many-valued logic
programming that approximates probabilistic logic programming. It makes a
global probabilistic semantics over possible worlds match with the
truth-functionality behind logic programming techniques. Differently from
many other axioms, the axiom (1) is compatible with logical implication. Note
that pcp-interpretations are uniquely determined by the truth values they
give to all ground atoms [5], and thus they can be identified with mappings
from $HB_\Phi$ to $[0, 1]$.

A probabilistic formula $F$ is a pc-consequence of a set of probabilistic
formulas $\mathcal{F}$, denoted $\mathcal{F} \models^{pc} F$, iff each
pcp-interpretation that is a model of $\mathcal{F}$ is also a model of $F$.

2.3  Many- Valued  Disjunctive  Logic  Programs 

We are now ready to define many-valued disjunctive logic programs. We start
by defining many-valued disjunctive logic program clauses, which are special
atomic probabilistic formulas that are interpreted under pcp-interpretations.
A many-valued disjunctive logic program clause (or mvd-clause) is a
probabilistic formula of the kind

$prob(A_1 \lor \cdots \lor A_l \leftarrow B_1 \land \cdots \land B_m \land \neg C_1 \land \cdots \land \neg C_n) \ge c$,

where $A_1, \ldots, A_l, B_1, \ldots, B_m, C_1, \ldots, C_n$ are atoms,
$l, m, n \ge 0$, and $c \in [0, 1]$ is rational. It is abbreviated by
$(A_1 \lor \cdots \lor A_l \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n)[c, 1]$.
We call $A_1 \lor \cdots \lor A_l$ its head,
$B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n$ its body, and $c$ its truth
value. It is positive (resp., definite) iff $n = 0$ (resp., $l = 1$ and
$n = 0$). It is called an integrity clause iff $l = 0$, a fact iff $l > 0$
and $m + n = 0$, and a rule iff $l > 0$ and $m + n > 0$. A many-valued
disjunctive logic program (or mvd-program) $P$ is a finite set of
mvd-clauses. A positive (resp., definite) mvd-program is a finite set of
positive (resp., definite) mvd-clauses. Given an mvd-program $P$, we identify
$\Phi$ with the vocabulary $\Phi(P)$ of all function and predicate symbols in
$P$. Denote by $HB_P$ the Herbrand base over $\Phi(P)$, and by $ground(P)$
the set of all ground instances of members of $P$ w.r.t. $\Phi(P)$. The set
of truth values of $P$, denoted $TV(P)$, is the least set of rational numbers
$\{\tfrac{0}{n}, \tfrac{1}{n}, \ldots, \tfrac{n}{n}\}$ that contains all the
rational numbers in $P$, where $n \ge 2$ is a natural number. Denote by $I_P$
the set of all pcp-interpretations over $HB_P$ into $TV(P)$.
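
As a small illustration of the definition of $TV(P)$, the following sketch computes the least grid $\{0, 1/n, \ldots, 1\}$ with $n \ge 2$ that contains a given list of clause truth values. The function name `truth_values` and the string input format are assumptions made for this example; exact rational arithmetic is used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9 and later
from typing import Iterable, List


def truth_values(clause_values: Iterable[str]) -> List[Fraction]:
    """Least set {0/n, 1/n, ..., n/n}, n >= 2, containing all given values."""
    denominators = [Fraction(v).denominator for v in clause_values]
    n = max(lcm(*denominators), 2)  # lcm() of no arguments is 1
    return [Fraction(k, n) for k in range(n + 1)]


# Truth values .5, .8, .7, .9 (as in the example program of Section 2.4)
# lead to n = 10.
tv = truth_values(["0.5", "0.8", "0.7", "0.9"])
```

The least admissible n is the least common multiple of the denominators, raised to 2 when all values are 0 or 1.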

The following result shows that the truth of a ground mvd-clause under
a pcp-interpretation is a function of the truth values of the contained
ground atoms.

Theorem 2.1. Let
$C = (A_1 \lor \cdots \lor A_l \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n)[c, 1]$
be a ground mvd-clause, and let $p$ be a pcp-interpretation. Then, $p$ is a
model of $C$ iff

$$\max\Bigl(\max_{1 \le i \le l} p(A_i),\ \max_{1 \le i \le n} p(C_i)\Bigr) \ \ge\ c - 1 + \min_{1 \le i \le m} p(B_i).$$
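
The condition of Theorem 2.1 is easy to evaluate mechanically. The sketch below is illustrative only: the function name `satisfies`, the dictionary representation of a pcp-interpretation (ground atom to truth value, unmentioned atoms being 0), and the string encoding of truth values are all assumptions. It also assumes the usual conventions that an empty maximum is 0 and an empty minimum is 1, which correspond to integrity clauses and facts, respectively.

```python
from fractions import Fraction
from typing import Dict, Sequence


def satisfies(p: Dict[str, str], head: Sequence[str],
              pos_body: Sequence[str], neg_body: Sequence[str],
              c: str) -> bool:
    """Check the inequality of Theorem 2.1 for one ground mvd-clause."""
    def value(atom: str) -> Fraction:
        return Fraction(p.get(atom, 0))  # unmentioned atoms have value 0

    # Left-hand side: maximum over head atoms and negated body atoms.
    lhs = max([value(a) for a in head] + [value(a) for a in neg_body],
              default=Fraction(0))
    # Right-hand side: c - 1 + minimum over positive body atoms.
    rhs = Fraction(c) - 1 + min([value(a) for a in pos_body],
                                default=Fraction(1))
    return lhs >= rhs


# Ground clause (reach(a,b) <- road(r,a,b), not closed(r))[.9, 1].
p_example = {"closed(r)": "0.5", "road(r,a,b)": "0.8", "reach(a,b)": "0.7"}
is_model = satisfies(p_example, ["reach(a,b)"], ["road(r,a,b)"],
                     ["closed(r)"], "0.9")
```

For `p_example` both sides of the inequality equal 7/10, so the clause is satisfied with no slack.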

We finally define queries and their correct and tight answers. A many-valued
query (or simply query) is an expression $\exists(F)[t, 1]$, where $F$ is a
ground classical formula and $t$ is a variable or a rational number from
$[0, 1]$. Given the queries $\exists(F)[c, 1]$ and $\exists(F)[x, 1]$ to an
mvd-program $P$, where $c \in [0, 1]$ and $x \in \mathcal{X}$, we define
their desired semantics in terms of correct and tight answers with respect
to a set $M(P)$ of models of $P$ as follows. The correct answer for
$\exists(F)[c, 1]$ to $P$ under $M(P)$ is Yes if
$c \le \inf\{p(F) \mid p \in M(P)\}$ and No otherwise. The tight answer for
$\exists(F)[x, 1]$ to $P$ under $M(P)$ is the substitution
$\theta = \{x/d\}$, where $d = \inf\{p(F) \mid p \in M(P)\}$.

In  the  rest  of  this  subsection,  we  recall  minimal,  perfect,  and  stable  models 
from  [5]  as  some  ways  of  describing  the  meaning  of  an  mvd-program. 

Minimal Models. For pcp-interpretations $p$ and $q$, we say $p$ is a subset
of $q$, denoted $p \subseteq q$, iff $p(A) \le q(A)$ for all
$A \in HB_\Phi$. We use $p \subset q$ as an abbreviation for
$p \subseteq q$ and $p \ne q$. A model $p$ of an mvd-program $P$ is a minimal
model of $P$ iff no model of $P$ is a proper subset of $p$. Denote by
$MM(P)$ the set of all minimal models of $P$.

Perfect Models. We first define the two relations $\prec$ and $\preceq$ on
ground atoms. For an mvd-program $P$, the priority relation $\prec$ and the
auxiliary relation $\preceq$ are the least binary relations on $HB_P$ with
the following properties. If $ground(P)$ contains an mvd-clause with the atom
$A$ in the head and the negative literal $not\ C$ in the body, then
$A \prec C$. If $ground(P)$ contains an mvd-clause with the atom $A$ in the
head and the positive literal $B$ in the body, then $A \preceq B$. If
$ground(P)$ contains an mvd-clause with the atoms $A$ and $A'$ in the head,
then $A \preceq A'$. If $A \prec B$, then $A \preceq B$. If $A \preceq B$ and
$B \preceq C$, then $A \preceq C$. If $A \preceq B$ and $B \prec C$, then
$A \prec C$. If $A \prec B$ and $B \preceq C$, then $A \prec C$. We say that
the ground atom $B$ has higher priority than the ground atom $A$ iff
$A \prec B$.

We next define the preference relation $\ll$ on pcp-interpretations as
follows. For pcp-interpretations $p$ and $q$, we say $p$ is preferable to
$q$, denoted $p \ll q$, iff $p \ne q$ and for each $A \in HB_P$ with
$p(A) > q(A)$ there is some $B \in HB_P$ with $q(B) > p(B)$ and $A \prec B$.
We write $p \underline{\ll} q$ iff $p \ll q$ or $p = q$.

A model $q$ of an mvd-program $P$ is a perfect model of $P$ iff no model of
$P$ is preferable to $q$. We use $PM(P)$ to denote the set of all perfect
models of $P$.

Not every mvd-program has a perfect model. We next define locally stratified
mvd-programs without integrity clauses, which always have a perfect model.

An mvd-program $P$ without integrity clauses is locally stratified iff $HB_P$
can be partitioned into sets $H_1, H_2, \ldots$ (called strata) such that for
each mvd-clause

$(A_1 \lor \cdots \lor A_l \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n)[c, 1] \in ground(P)$,

there exists an $i \ge 1$ such that all $A_1, \ldots, A_l$ belong to $H_i$,
all $B_1, \ldots, B_m$ belong to $H_1 \cup \cdots \cup H_i$, and all
$C_1, \ldots, C_n$ belong to $H_1 \cup \cdots \cup H_{i-1}$. For such a
partition $H_1, H_2, \ldots$ of $HB_P$ (called a local stratification of $P$)
and every $i \ge 1$, we use $P_i$ to denote the set of all mvd-clauses from
$ground(P)$ whose heads belong to $H_i$.

Stable Models. An extended many-valued disjunctive logic program clause
(or emvd-clause) is an expression
$(A_1 \lor \cdots \lor A_l ; d \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n)[c, 1]$,
where $A_1, \ldots, A_l, B_1, \ldots, B_m, C_1, \ldots, C_n$ are atoms,
$l, m, n \ge 0$, $c \in [0, 1]$ is rational, and $d \in [0, 1]$. It is true
in a pcp-interpretation $p$ under a variable assignment $\sigma$ iff

$$\max\Bigl(\max_{1 \le i \le l} p_\sigma(A_i),\ \max_{1 \le i \le n} p_\sigma(C_i),\ d\Bigr) \ \ge\ c - 1 + \min_{1 \le i \le m} p_\sigma(B_i).$$

Thus, emvd-clauses may also contain truth-value constants in their heads.

For an mvd-program $P$ and a pcp-interpretation $q$, the expression $P/q$
denotes the set of emvd-clauses that is obtained from $ground(P)$ by
replacing every mvd-clause
$(A_1 \lor \cdots \lor A_l \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n)[c, 1]$
by the emvd-clause

$$(A_1 \lor \cdots \lor A_l ;\ \max_{1 \le i \le n} q(C_i) \leftarrow B_1, \ldots, B_m)[c, 1].$$

A pcp-interpretation $q$ is a stable model of an mvd-program $P$ iff $q$ is a
minimal model of $P/q$. We use $SM(P)$ to denote the set of all stable models
of $P$.
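
The construction of $P/q$ can be sketched for ground clauses as follows. This is a hedged illustration, not an implementation from the paper: the tuple encodings `(head, pos, neg, c)` and `(head, d, pos, c)`, the names `reduct` and `emvd_true`, and the dictionary representation of pcp-interpretations are assumptions, and an empty maximum (resp., minimum) is taken to be 0 (resp., 1). Checking minimality of $q$ with respect to $P/q$ is not covered here.

```python
from fractions import Fraction
from typing import Dict, List, Sequence, Tuple

GroundClause = Tuple[Sequence[str], Sequence[str], Sequence[str], str]
EmvdClause = Tuple[Sequence[str], Fraction, Sequence[str], str]


def reduct(ground_clauses: List[GroundClause],
           q: Dict[str, str]) -> List[EmvdClause]:
    """Build P/q: negative literals are replaced by the constant max q(C_i)."""
    result = []
    for head, pos, neg, c in ground_clauses:
        d = max((Fraction(q.get(a, 0)) for a in neg), default=Fraction(0))
        result.append((head, d, pos, c))
    return result


def emvd_true(p: Dict[str, str], head: Sequence[str], d: Fraction,
              pos: Sequence[str], c: str) -> bool:
    """Truth of a ground emvd-clause without negative literals in p."""
    lhs = max([Fraction(p.get(a, 0)) for a in head] + [d])
    rhs = Fraction(c) - 1 + min((Fraction(p.get(a, 0)) for a in pos),
                                default=Fraction(1))
    return lhs >= rhs


clauses = [(["reach(a,b)"], ["road(r,a,b)"], ["closed(r)"], "0.9")]
q = {"closed(r)": "0.5", "road(r,a,b)": "0.8", "reach(a,b)": "0.7"}
reduced = reduct(clauses, q)
```

Here the negative literal `not closed(r)` is replaced by the constant `q(closed(r)) = 1/2` in the head of the reduced clause.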


2.4  Example 

We now give an illustrative example. The following mvd-program $P$ is taken
from [5] ($r$, $s$, $a$, $b$, and $c$ are constant symbols, and $R$, $X$,
$Y$, and $Z$ are variables):

$P = \{(closed(r) \lor closed(s) \leftarrow)[.5, 1],\ (road(r, a, b) \leftarrow)[.8, 1],\ (road(s, b, c) \leftarrow)[.7, 1],$
$(reach(X, Y) \leftarrow road(R, X, Y),\ not\ closed(R))[.9, 1],$
$(reach(X, Z) \leftarrow reach(X, Y),\ reach(Y, Z))[.9, 1]\}$.

The set of truth values of $P$ is given by $TV(P) = \{0, 0.1, \ldots, 1\}$.

A query to $P$ may be given by $\exists(reach(a, c))[U, 1]$, where $U$ is a
variable. To determine its tight answer, we must specify a set of models of
$P$. Some models $p_1$, $p_2$, $p_3$, and $p_4$ of $P$ are shown in Table 1
(we assume $p_i(A) = 0$ for all unmentioned $A \in HB_P$). More precisely,
the models $p_1$, $p_2$, $p_3$, and $p_4$ are some minimal models of $P$,
whereas the models $p_1$ and $p_2$ are the only perfect and stable models of
the locally stratified mvd-program $P$. The tight answer for
$\exists(reach(a, c))[U, 1]$ to $P$ under $\{p_1, p_2, p_3, p_4\}$ and
$\{p_1, p_2\}$ is given by $\{U/0\}$ and $\{U/0.5\}$, respectively.

Table 1. Some models of the mvd-program P

      closed(r)  closed(s)  road(r,a,b)  road(s,b,c)  reach(a,b)  reach(b,c)  reach(a,c)
p_1   0.5        0          0.8          0.7          0.7         0.6         0.5
p_2   0          0.5        0.8          0.7          0.7         0.6         0.5
p_3   0          0.6        0.8          0.7          0.7         0           0
p_4   0          0.7        0.8          0.7          0           0           0
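
The tight answers stated above can be recomputed from the reach(a,c) column of Table 1. The following sketch handles only atomic queries over a finite, nonempty set of models (so the infimum is a minimum); the names `tight_answer` and `correct_answer` and the dictionary encoding of the models are assumptions made for this illustration.

```python
from fractions import Fraction
from typing import Dict, List

Interpretation = Dict[str, str]  # ground atom -> truth value; default is 0


def tight_answer(models: List[Interpretation], atom: str) -> Fraction:
    """d = inf{ p(atom) | p in models } for a finite, nonempty set of models."""
    return min(Fraction(p.get(atom, 0)) for p in models)


def correct_answer(models: List[Interpretation], atom: str, c: str) -> str:
    """Yes iff c <= inf{ p(atom) | p in models }."""
    return "Yes" if Fraction(c) <= tight_answer(models, atom) else "No"


# Only the reach(a,c) column of Table 1 is needed for this query.
p1 = {"reach(a,c)": "0.5"}
p2 = {"reach(a,c)": "0.5"}
p3: Interpretation = {}  # reach(a,c) = 0
p4: Interpretation = {}  # reach(a,c) = 0

under_all_four = tight_answer([p1, p2, p3, p4], "reach(a,c)")
under_p1_p2 = tight_answer([p1, p2], "reach(a,c)")
```

With these inputs, `under_all_four` is 0 and `under_p1_p2` is 1/2, matching the two substitutions given in the text.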

3  Least  Model  States 

We now define least model states for positive mvd-programs, which are a
generalization of their classical counterparts by Minker and Rajasekar
[12,4].

In the sequel, we use $A^{\alpha}$ to abbreviate atomic probabilistic
formulas of the form $prob(A) \ge \alpha$. Given an mvd-program $P$, the
disjunctive Herbrand base for $P$, denoted $DHB_P$, is the set of all
disjunctions of atomic probabilistic formulas
$A_1^{\alpha_1} \lor \cdots \lor A_k^{\alpha_k}$ with pairwise distinct
ground atoms $A_1, \ldots, A_k \in HB_P$,
$\alpha_1, \ldots, \alpha_k \in TV(P) \setminus \{0\}$, and $k \ge 1$. A
disjunctive Herbrand state (or state) $S$ is a subset of $DHB_P$. A state $S$
is a model state of a positive mvd-program $P$ iff

$$\{D \in DHB_P \mid S \cup P \models^{pc} D\} \subseteq S.$$

A model $p$ of a state $S$ is a minimal model of $S$ iff no model of $S$ is a
proper subset of $p$. We use $MM(S)$ to denote the set of all minimal models
of $S$. The canonical form (resp., expansion) of a state $S$, denoted
$can(S)$ (resp., $exp(S)$), is defined by:

$can(S) = \{D \in S \mid \forall D' \in S,\ D' \ne D : \{D'\} \not\models^{pc} D\}$,
$exp(S) = \{D \in DHB_P \mid \exists D' \in S : \{D'\} \models^{pc} D\}$.

A state $S$ is in canonical form (resp., expanded) iff $S = can(S)$ (resp.,
$S = exp(S)$).

The  following  theorem  shows  that  the  intersection  of  a  set  of  model  states  of 
a  positive  mvd-program  P  is  also  a  model  state  of  P. 

Theorem 3.1. Let $P$ be a positive mvd-program, and let $\mathcal{S}$ be a
set of model states of $P$. Then, the intersection of all
$S \in \mathcal{S}$ is a model state of $P$.

Clearly,  each  positive  mvd-program  P  has  the  model  state  DHBp.  Thus, 
there  exist  model  states  of  P,  and  the  intersection  of  all  of  them  is  the  least 
model  state  of  P. 

Definition 3.2. Denote by $MS_P$ the least model state of a positive
mvd-program $P$.

The following result shows that $MS_P$ is the set of all disjunctions
$D \in DHB_P$ that are pc-consequences of $P$. Moreover, it shows that this
set coincides with the set of all disjunctions $D \in DHB_P$ that are true in
all minimal models of $P$.

Theorem 3.3. Let $P$ be a positive mvd-program. Then,

(a) $MS_P = \{D \in DHB_P \mid P \models^{pc} D\}$.

(b) $MS_P = \{D \in DHB_P \mid MM(P) \models D\}$.

As shown in [6,7], definite mvd-programs $P$ have a unique least model
$M_P$. The next theorem shows that for such $P$, the model $M_P$ corresponds
to $can(MS_P)$.

Theorem 3.4. Let $P$ be a definite mvd-program, and let $M_P$ be the least
model of $P$. Then, $can(MS_P) = S_P$ where
$S_P = \{A^{\alpha} \in DHB_P \mid \alpha = M_P(A)\}$.

We  give  an  illustrative  example. 

Example  3.5.  Consider  the  following  positive  mvd-program  P: 

P  =  {(cZo5ed(r)Vc/o5ed(s)<— )[.5, 1],  (road(r,a,  6)^)[.8, 1],  (road(s,  6,c)<— )[.7, 1], 
[reach{X^Y)\/ clos^{R)  road(P,X,y))[.9, 1], 

[reachlx,  Z)  ^  reach{X,Y),reach{Y,Z))[.^,  1]}. 

The  set  of  truth  values  of  P  is  given  by  TV (P)  =  {0, 0.1, . . . ,  1}.  The  canonical 
form  of  the  least  model  state  MS p  of  P  is  given  as  follows: 

$can(MS_P) = \{closed^{0.5}(r) \lor closed^{0.5}(s),\ road^{0.8}(r,a,b),\ road^{0.7}(s,b,c),$
$reach^{0.7}(a,b) \lor closed^{0.7}(r),\ reach^{0.6}(b,c) \lor closed^{0.6}(s),$
reac/i®‘®(a,  c)W closed^ {r)y  closedP'^{s)}  . 

4 Unfolding Many-Valuedness

In this section, we give translations of mvd-programs under the semantics of
minimal models, perfect models, stable models, and least model states into
classical disjunctive logic programs under the respective classical semantics.

4.1  Program  Translations 

We now formally define translations of mvd-programs and pcp-interpretations
into classical disjunctive logic programs and classical interpretations,
respectively.

Given an mvd-program P, the many-valued alphabet for P, denoted $\Phi^+(P)$,
is obtained from $\Phi(P)$ by replacing each predicate symbol p by the new
predicate symbols $p^\alpha$ with $\alpha \in TV(P) \setminus \{0\}$. The
many-valued Herbrand base for P, denoted $HB_P^+$, is the Herbrand base over
$\Phi^+(P)$. For atoms $A = p(t_1, \ldots, t_k)$ and $\alpha \in TV(P)$, the
atom $A^\alpha$ over $\Phi^+(P)$ is defined as $p^\alpha(t_1, \ldots, t_k)$.

Every mvd-program P is translated into the following classical disjunctive
logic program $Tr(P) = Tr_1(P) \cup Tr_2(P)$ over $\Phi^+(P)$ (based on Theorem 2.1):

lYi  (P)  =  {A?  V  •••  V  Af  ^  ,  not  Cf , ,  not  (7“  | 

{j4i  V  •  •  •  V  A|  <-  Pi, . . . ,  Pm,  not  Cl,...,  not  C„)[c,  1]  €  P, 
I3i,...,0m€  TV{P),  a  =  c-l  +  min(/?i,...,/3m)>0}, 

$Tr_2(P) = \{A^\alpha \leftarrow A^\beta \mid A^\alpha, A^\beta \in HB_P^+,\ \alpha < \beta\}$.

Every pcp-interpretation p is translated into the following classical
interpretation:

$Tr(p) = \{A^\alpha \in HB_P^+ \mid p(A) \ge \alpha\}$.

The  following  example  illustrates  the  above  program  translation. 

Example 4.1. The mvd-program P given in Section 2.4 is translated into the
classical disjunctive logic program $Tr(P) = Tr_1(P) \cup Tr_2(P)$, where
$Tr_1(P)$ is given by:

Tri(P)  =  { closed®'® (r)V dosed®'® (s)  <—  ;  road®'®(r,a, 6)  <—  ;  road^'^(s,b,c)  ; 
reach°\X,Y)<^road°-^{R,X,Y),notdosed°\Ry, 
reach°  '^{X,  Y)  road°-^{R, X,  Y),  not  dosed°-^(Ry, . . . ; 
reach°-^{X,  Y)  4-  road^{R,  X,  Y),  not  dosed®'® (P); 
reachP  \X,  Z)  ^  reach°  \X,  Y),  mid»®'®(y,  Z); 
reacft®'^(X,  Z)  4-  reach°  \X,  Y),  reach°-^{Y,  Z); 
reach°-^{X,  Z) «-  readj®'®(X,  Y),  reach°  \Y,  Z); 
reach°-^{X,  Z)  <-  reach°-^{X,  Y),  reach°\Y,  Z);...; 
reach°-^{X,  Z)  4-  reach}(X,  Y),  reach} {Y,  Z)} . 

Note that $Tr_1(P)$ may be quite large. It generally has a manageable size when
there are few truth values in TV(P) and few positive literals in the bodies of
clauses in P.
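The growth in size can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: it handles positive clauses only (no `not` literals), it represents truth values as integers in tenths so that the arithmetic is exact, and it takes the minimum over an empty body to be 1. The function name `translate_positive_clause` and the two-atom clause are hypothetical; the one-atom clause is the fourth clause of Example 3.5.

```python
from itertools import product


def translate_positive_clause(heads, body, c, levels):
    """Enumerate the annotated instances of one positive clause (heads <- body)[c, 1].

    Truth values are integers in tenths (9 stands for 0.9).
    Each instance is (annotated_heads, annotated_body), atoms being (name, level).
    """
    instances = []
    for betas in product(levels, repeat=len(body)):
        # alpha = c - 1 + min(beta_1, ..., beta_m); an empty body counts as 1
        alpha = c - 10 + min(betas, default=10)
        if alpha > 0:
            instances.append(
                ([(h, alpha) for h in heads],
                 [(b, beta) for b, beta in zip(body, betas)])
            )
    return instances


levels = range(1, 11)  # nonzero values of TV(P) = {0, 0.1, ..., 1}, in tenths

one_atom = translate_positive_clause(
    ["reach(X,Y)", "closed(R)"], ["road(R,X,Y)"], 9, levels)
two_atoms = translate_positive_clause(
    ["h"], ["b1", "b2"], 9, levels)  # hypothetical clause with two body atoms

print(len(one_atom), len(two_atoms))
```

With ten nonzero truth values, one body atom yields 9 instances and two body atoms yield 81, which suggests why the number of positive body literals dominates the size of the translation.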

4.2  Unfolding  Results 

Minimal  Models.  The  following  lemma  shows  that  every  mvd-program  P  is 
equivalent  to  its  translation  Tr(P),  under  all  pcp-interpretations  into  TV{P). 

Lemma 4.2. Let P be an mvd-program, and let p be a pcp-interpretation into
TV(P). Then, p is a model of P iff p is a model of Tr(P).

The next lemma shows that pcp-interpretations p into TV(P) can be identified
with their translation Tr(p), concerning classical disjunctive logic programs
over $\Phi^+(P)$.

Lemma 4.3. Let P be an mvd-program. Let L be a classical disjunctive logic
program over the alphabet $\Phi^+(P)$, and let p be a pcp-interpretation into
TV(P). Then, p is a model of L iff Tr(p) is a model of L.

The  following  theorem  shows  that  Tr  translates  mvd-programs  under  the 
minimal  model  semantics  into  equivalent  classical  disjunctive  logic  programs 
under  the  minimal  model  semantics.  It  can  be  proved  using  the  two  lemmata 
above. 

Theorem 4.4. Let P be an mvd-program, and let p be a pcp-interpretation.
Then, p is a minimal model of P iff Tr(p) is a minimal model of Tr(P).

Perfect Models. The alphabet $\Phi_0^+(P)$ is obtained from $\Phi(P)$ by
replacing each predicate symbol p by the new predicate symbols $p^\alpha$ with
$\alpha \in TV(P)$.

We slightly modify the translation of mvd-programs and pcp-interpretations
as follows. Every mvd-program P is translated into the following classical
disjunctive logic program $Tr^*(P) = Tr(P) \cup Tr_3(P)$ over $\Phi_0^+(P)$:

Tr3(P)  =  \AeHBp}U{A^  ^  \AgHBp}. 

Every pcp-interpretation p is translated into the following classical
interpretation:

$Tr^*(p) = Tr(p) \cup \{A^0 \mid A \in HB_P\}$.

Roughly speaking, the next lemma shows that pcp-interpretations p into
TV(P) can be identified with their translation $Tr^*(p)$.

Lemma 4.5. Let P be an mvd-program. Let L be a classical disjunctive logic
program over the alphabet $\Phi_0^+(P)$, and let p be a pcp-interpretation into
TV(P). Then, p is a model of L iff $Tr^*(p)$ is a model of $L \cup Tr_3(P)$.

The  following  theorem  shows  that  Tr*  translates  mvd-programs  under  the 
perfect  model  semantics  into  equivalent  classical  counterparts. 

Theorem 4.6. Let P be an mvd-program, and let p be a pcp-interpretation.
Then, p is a perfect model of P iff $Tr^*(p)$ is a perfect model of $Tr^*(P)$.

The  following  theorem  shows  that  the  translation  Tr(P)  of  a  locally  stratified 
mvd-program  P  is  also  locally  stratified. 

Theorem 4.7. Let P be an mvd-program. If P is locally stratified, then so is
Tr(P).

The  next  theorem  shows  that  Tr  translates  locally  stratified  mvd-programs 
under  the  perfect  model  semantics  into  equivalent  classical  counterparts. 

Theorem 4.8. Let P be a locally stratified mvd-program, and let p be a
pcp-interpretation. Then, p is a perfect model of P iff Tr(p) is a perfect
model of Tr(P).

Stable Models. For classical disjunctive logic programs L and classical
interpretations I, denote by L/I the classical Gelfond-Lifschitz transform of L
w.r.t. I.

The next lemma shows that for mvd-programs P and pcp-interpretations q,
the transform P/q is equivalent to Tr(P)/Tr(q), under all pcp-interpretations
into TV(P).

Lemma 4.9. Let P be an mvd-program, and let p and q be two pcp-interpretations
into TV(P). Then, p is a model of P/q iff p is a model of Tr(P)/Tr(q).

The  next  theorem  shows  that  Tr  translates  mvd-programs  under  the  stable 
model  semantics  into  equivalent  classical  counterparts. 

Theorem 4.10. Let P be an mvd-program, and let p be a pcp-interpretation.
Then, p is a stable model of P iff Tr(p) is a stable model of Tr(P).

Least Model States. The following lemma shows that every mvd-program P
is equivalent to its translation Tr(P), concerning disjunctive Herbrand states.

Lemma 4.11. Let P be an mvd-program, and let S be a state. Then, S is a
model state of P iff S is a model state of Tr(P).

The following theorem shows that Tr translates an mvd-program into a
classical counterpart that has the same least model state.

Theorem 4.12. Let P be an mvd-program, and let S be a state. Then, S is the
least model state of P iff S is the least model state of Tr(P).

5 Fixpoint Characterizations

In this section, we provide many-valued fixpoint characterizations for the
semantics of minimal models, least model states, and perfect models under
finite local stratification.


5.1  Minimal  Models  for  Positive  Programs 

We  now  give  a  fixpoint  characterization  for  the  set  of  all  minimal  models  of 
a  positive  mvd-program,  which  is  a  generalization  of  the  classical  counterpart 
given  in  [3,18]. 

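In the sequel, let P be a positive mvd-program. The canonical form (resp.,
expansion) of a set of pcp-interpretations $\mathcal{P}$, denoted
$can(\mathcal{P})$ (resp., $exp(\mathcal{P})$), is defined by:

$can(\mathcal{P}) = \{p \in \mathcal{P} \mid \neg\exists q \in \mathcal{P}\colon q \subset p\}$,
$exp(\mathcal{P}) = \{p \in I_P \mid \exists q \in \mathcal{P}\colon q \subseteq p\}$.

We say $\mathcal{P}$ is in canonical form (resp., expanded) iff
$\mathcal{P} = can(\mathcal{P})$ (resp., $\mathcal{P} = exp(\mathcal{P})$).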

The fixpoint operator is defined on the complete lattice
$\langle \mathcal{E}, \sqsubseteq \rangle$, where $\mathcal{E}$ is the set of
all expanded sets of pcp-interpretations, and $\mathcal{P} \sqsubseteq \mathcal{Q}$
iff $\mathcal{Q} \subseteq \mathcal{P}$ for all $\mathcal{P}, \mathcal{Q} \in \mathcal{E}$.
The bottom element $\bot$ is the set of all pcp-interpretations, and the
top element $\top$ is the empty set. The greatest lower bound of any subset of
elements is the union of the elements in the set, and the least upper bound is
the intersection of the elements.

The operator $T_P$ on expanded sets of pcp-interpretations $\mathcal{P}$ is
defined by:

$T_P(\mathcal{P}) = \bigcup \{models_p(state_P(p)) \mid p \in \mathcal{P}\}$,

where $state_P$ and $models_p$ are given as follows:

$state_P(p) = \{A_1^\alpha \lor \cdots \lor A_l^\alpha \mid (A_1 \lor \cdots \lor A_l \leftarrow B_1, \ldots, B_m)[c, 1] \in ground(P),\ \alpha = c - 1 + \min(p(B_1), \ldots, p(B_m)) > 0\}$,
$models_p(S) = \{q \in I_P \mid q \models S,\ q \supseteq p\}$.

The next lemma shows the immediate result that $T_P$ is monotonic.

Lemma 5.1. $T_P$ is monotonic.

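We now define the powers of $T_P$. For every expanded set of
pcp-interpretations $\mathcal{P}$:

$T_P \uparrow \alpha(\mathcal{P}) = \mathcal{P}$, if $\alpha = 0$;
$T_P \uparrow \alpha(\mathcal{P}) = T_P(T_P \uparrow (\alpha - 1)(\mathcal{P}))$, if $\alpha > 0$ is a successor ordinal;
$T_P \uparrow \alpha(\mathcal{P}) = \bigcap \{T_P \uparrow \beta(\mathcal{P}) \mid \beta < \alpha\}$, if $\alpha > 0$ is a limit ordinal.

As usual, we use $T_P \uparrow \alpha$ to abbreviate $T_P \uparrow \alpha(\bot)$.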

The following lemma shows that the operator $T_P$ is not continuous. This
result is immediate by the fact that the classical counterpart of $T_P$ is not
continuous [18].

Lemma 5.2. $T_P$ is not continuous.

Even though the operator $T_P$ is not continuous, its least fixpoint is attained
at the first limit ordinal. This is shown by the following theorem, which follows
from a similar result for classical disjunctive logic programs [18].

Theorem 5.3. $lfp(T_P) = T_P \uparrow \omega$.

The next theorem shows that the set of minimal models of P is given by the
canonical form of the least fixpoint of $T_P$.

Theorem 5.4. $MM(P) = can(lfp(T_P))$.


5.2  Least  Model  States  for  Positive  Programs 

We now give a fixpoint characterization for the least model state of a
positive mvd-program, which is a generalization of the classical counterpart
given in [12,4].

In the sequel, let P be a positive mvd-program. We now identify every
disjunction $D \in DHB_P$ with the set of all atoms it contains.

The operator $T_P$ on expanded disjunctive Herbrand states S is defined by:

r^(5)  =  e2:p{{AfV---VAfVPiV--'VPrn  I  e  PPPp, 

(Ai V  • '  •  VA;  ^  Pi  A  •  •  ♦  A  Bm)[c,  1]  €  ground{P), 

Bf' V£»i, . . . ,  eS,a  =  c-l+  min(0u y9m)>0}) . 

The  following  lemma  shows  that  the  model  states  of  P  correspond  exactly 
to  the  pre-fixpoints  of  the  operator  Tp. 

Lemma 5.5. Let S be an expanded state. Then, S is a model state of P iff
$T_P(S) \subseteq S$.

The  next  lemma  shows  that  the  operator  Tp  is  continuous.  This  result  follows 
immediately  from  the  continuity  of  the  classical  counterpart  of  Tp  [12]. 


Lemma 5.6. $T_P$ is continuous.

The powers of $T_P$ are defined as usual: For all Herbrand states S, define
$T_P \uparrow \omega(S)$ as the union of all $T_P \uparrow n(S)$ with
$n < \omega$, where $T_P \uparrow 0(S) = S$ and
$T_P \uparrow (n + 1)(S) = T_P(T_P \uparrow n(S))$ for all $n < \omega$. We use
$T_P \uparrow \omega$ to abbreviate $T_P \uparrow \omega(\emptyset)$.

The following theorem shows that the least model state of P coincides with
the least fixpoint of $T_P$, and that the least fixpoint is attained at the
first limit ordinal. This result follows immediately from Lemmata 5.5 and 5.6.

Theorem 5.7. $MS_P = lfp(T_P) = T_P \uparrow \omega$.

We  give  an  illustrative  example. 

Example 5.8. Consider again the positive mvd-program P given in
Example 3.5. Its least model state $MS_P$ is given by
$T_P \uparrow \omega = T_P \uparrow 3$:

$can(T_P \uparrow 1) = S_1 = \{closed^{0.5}(r) \lor closed^{0.5}(s),\ road^{0.8}(r,a,b),\ road^{0.7}(s,b,c)\}$,
$can(T_P \uparrow 2) = S_2 = S_1 \cup \{reach^{0.7}(a,b) \lor closed^{0.7}(r),\ reach^{0.6}(b,c) \lor closed^{0.6}(s)\}$,
can(r^t3)  —  53  -  S2U{reach^'^{a,c)\/closed^‘\r)Vclosed^'^{s)}  . 
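The iteration in this example can be replayed with a short sketch. Several things here are assumptions rather than the authors' construction: the lower bound of the transitivity clause is illegible in this copy and is assumed to be 0.9 purely for illustration (`ASSUMED_BOUND`); only the ground instances that can actually fire are listed by hand; truth values are integers in tenths; and the sketch works on canonical representatives instead of expanded states. The helper names `subsumes`, `canonical` and `step` are hypothetical.

```python
from itertools import product

ASSUMED_BOUND = 9  # tenths; the source value is illegible, 0.9 is an assumption


def subsumes(d1, d2):
    """d1 entails d2: every atom of d1 occurs in d2 with an annotation no larger."""
    return all(a in d2 and d2[a] <= lvl for a, lvl in d1.items())


def canonical(state):
    """Remove duplicates and every disjunction entailed by a different one."""
    ds = [dict(fs) for fs in {frozenset(d.items()) for d in state}]
    return [d for d in ds if not any(e != d and subsumes(e, d) for e in ds)]


def step(program, state):
    """One application of the state operator on canonical representatives."""
    derived = []
    for heads, body, c in program:
        # for each body atom B_i, the disjunctions B_i v D_i available in the state
        choices = [[d for d in state if b in d] for b in body]
        for picked in product(*choices):
            alpha = c - 10 + min((d[b] for b, d in zip(body, picked)), default=10)
            if alpha <= 0:
                continue
            new = {h: alpha for h in heads}
            for b, d in zip(body, picked):
                for atom, lvl in d.items():
                    if atom != b:  # carry over the remainder D_i
                        new[atom] = min(lvl, new.get(atom, lvl))
            derived.append(new)
    return canonical(derived)


# (heads, body, lower bound in tenths); ground instances chosen by hand
program = [
    (["closed(r)", "closed(s)"], [], 5),
    (["road(r,a,b)"], [], 8),
    (["road(s,b,c)"], [], 7),
    (["reach(a,b)", "closed(r)"], ["road(r,a,b)"], 9),
    (["reach(b,c)", "closed(s)"], ["road(s,b,c)"], 9),
    (["reach(a,c)"], ["reach(a,b)", "reach(b,c)"], ASSUMED_BOUND),
]

states = []
state = []
for _ in range(4):
    state = step(program, state)
    states.append(state)

for n, s in enumerate(states, start=1):
    print(n, sorted(sorted(d.items()) for d in s))
```

Under these assumptions the third and fourth iterates coincide, in line with the example's claim that the least model state is reached after three steps; the annotation obtained for reach(a,c) depends entirely on the assumed bound.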

5.3  Perfect  Models  under  Finite  Local  Stratification 

We  now  give  an  iterative  fixpoint  characterization  of  perfect  models  of  mvd- 
programs  with  finite  local  stratification.  It  generalizes  the  classical  counterpart 
in  [18]. 

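For sets of emvd-clauses P and sets of expanded interpretations $\mathcal{P}$,
we define:

$\overline{T}_P(\mathcal{P}) = \bigcup \{models_p(\overline{state}_P(p)) \mid p \in \mathcal{P}\}$,

where $models_p$ is defined as in Section 5.1 and $\overline{state}_P$ is given by:

$\overline{state}_P(p) = \{A_1^\alpha \lor \cdots \lor A_l^\alpha \mid (A_1 \lor \cdots \lor A_l; d \leftarrow B_1, \ldots, B_m)[c, 1] \in ground(P),\ \alpha = c - 1 + \min(p(B_1), \ldots, p(B_m)) > d\}$.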

The  following  theorem  formulates  the  iterative  fixpoint  characterization. 

Theorem 5.9. Let P be an mvd-program and let $H_1, H_2, \ldots, H_n$ be a finite
local stratification of P. For pcp-interpretations p, we define:

Pi{p)  -  Pi/p  U  {A<^  ^1  A^  e  p{A)  >  a}  . 

Then,  the  set  of  perfect  models  of  P  is  given  as  Pm  where 

Pi  =  can{T^ I  cj) , 

Pi  =  U  a;)  I  p  G  Pi-i}  for  all  z  €  {2, . . . ,  n}  . 
6 Summary and Outlook

We introduced least model states for many-valued disjunctive logic programs.
We then showed how to unfold many-valuedness under the semantics of minimal
models, perfect models, stable models, and least model states. Thus, existing
technology for classical disjunctive logic programming can be used to implement
many-valued disjunctive logic programming. Using these results, we gave
many-valued fixpoint characterizations for the set of all minimal models and
the least model state. We also gave an iterative fixpoint characterization for
the perfect model semantics under finite local stratification.

An interesting topic of future research is to elaborate other semantics for
many-valued disjunctive logic programs, for example, to define partial stable
models. Moreover, it would be very interesting to work out fixpoint
characterizations for stable (and partial stable) models. This may be done by
generalizing the evidential transformation in [2] or the 3-S transformation in
[17]. Finally, another topic of future research is to elaborate proof theories
for the various semantics.
Abstract. Considering different implication operators, such as Łukasiewicz,
Gödel or product implication in the same logic program, naturally
leads to the allowance of several adjoint pairs in the lattice of
truth-values. In this paper we apply this idea to introduce multi-adjoint
logic programs as an extension of monotonic logic programs. The continuity
of the immediate consequences operators is proved and the assumptions
required to get continuity are further analysed.


1  Introduction 

One can find several papers in the literature on applications of definite fuzzy
logic programming which are based either on Łukasiewicz, or product, or Gödel
implications on the unit real interval (an overview can be seen in [9]); for more
complex systems it is reasonable to allow room for several different implications.
In [2] an extension was presented in which the set of truth-values is generalised
to a residuated lattice (in order to embed hybrid probabilistic logic programs).
Another generalisation of the set of truth-values is that given by the structure
of bilattice, which has been used to handle negation in logic programming [5].

The purpose of this work is to provide a further generalisation of the
framework given in [2,3] so that: (1) it is possible to use a number of
different implications in the rules of our programs, (2) the algebraic
requirements on residuated lattices are weakened and (3) we focus on the
continuity of the immediate consequences operator by providing sufficient
conditions for continuity.

A general theory of logic programming which allows the simultaneous use
of different implications in the rules and rather general connectives in the
bodies is presented. Models of these programs are post-fixpoints of the
immediate consequences operator, which is proved to be monotonic under very
general hypotheses.

The final part of the paper deals with the continuity of the immediate
consequences operator, which is proved under the assumption of continuity of
all the operators in the program (except, possibly, the implications). This
theorem is also re-stated in terms of lower-semicontinuity of the operators.

2  Preliminary  Definitions 

We will make extensive use of the constructions and terminology of universal
algebra, in order to define formally the syntax and the semantics of the languages
we will deal with. A minimal set of concepts from universal algebra, which will
be used in the sequel in the style of [2], is introduced below.

2.1  Some  Definitions  from  Universal  Algebra 

Definition 1 (Graded set). A graded set is a set $\Omega$ with a function which
assigns to each element $\omega \in \Omega$ a number $n \ge 0$, called the arity
of $\omega$.

Definition 2 ($\Omega$-Algebra). Given a graded set $\Omega$, an
$\Omega$-algebra $\mathfrak{A}$ is a pair $\langle A, I \rangle$ where A is a
nonempty set called the carrier, and I is a function which assigns maps to the
elements of $\Omega$ as follows:

1. Each element $\omega \in \Omega_n$, $n > 0$, is interpreted as a map
$I(\omega)\colon A^n \to A$, denoted by $\omega_{\mathfrak{A}}$.

2. Each element $c \in \Omega_0$ (i.e., c is a constant) is interpreted as an
element $I(c)$ in A, denoted by $c_{\mathfrak{A}}$.

Finally, the last definition needed is that of a subalgebra of an
$\Omega$-algebra, which generalises the concept of substructure of an algebraic
structure. The definition is straightforward.

Definition 3 (Subalgebra of an $\Omega$-algebra). Given an $\Omega$-algebra
$\mathfrak{A} = \langle A, I \rangle$, an $\Omega$-subalgebra $\mathfrak{B}$ is
a pair $\langle B, J \rangle$ such that $B \subseteq A$ and

1. $J(c) = I(c)$ for all $c \in \Omega_0$.

2. Given $\omega \in \Omega_n$, then $J(\omega)\colon B^n \to B$ is the
restriction of $I(\omega)\colon A^n \to A$.

2.2  Multi-adjoint  Semilattices  and  Multi-adjoint  Algebras 

The main concept we will need in this section is that of adjoint pair, first
introduced in a logical context by Pavelka [8], who interpreted the poset structure
of the set of truth-values as a category, and the relation between the connectives
of implication and conjunction as functors in this category. The result turned
out to be another example of the well-known concept of adjunction, introduced
by Kan in the general setting of category theory in 1958.

Definition 4 (Adjoint pair). Let $\langle P, \preceq \rangle$ be a partially
ordered set and $(\leftarrow, \&)$ a pair of binary operations in P such that:

(a1) Operation $\&$ is increasing in both arguments, i.e. if $x_1, x_2, y \in P$
such that $x_1 \preceq x_2$ then $(x_1 \,\&\, y) \preceq (x_2 \,\&\, y)$ and
$(y \,\&\, x_1) \preceq (y \,\&\, x_2)$;

(a2) Operation $\leftarrow$ is increasing in the first argument (the consequent)
and decreasing in the second argument (the antecedent), i.e. if
$x_1, x_2, y \in P$ such that $x_1 \preceq x_2$ then
$(x_1 \leftarrow y) \preceq (x_2 \leftarrow y)$ and
$(y \leftarrow x_2) \preceq (y \leftarrow x_1)$;

(a3) For any $x, y, z \in P$, we have that $x \preceq (y \leftarrow z)$ holds if
and only if $(x \,\&\, z) \preceq y$ holds.

Then we say that $(\leftarrow, \&)$ forms an adjoint pair in $\langle P, \preceq \rangle$.
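As an illustration of the adjoint condition, the following sketch checks it on a finite grid of the unit interval for the product, Gödel and Łukasiewicz pairs. It is a numerical sanity check under assumptions, not a proof: the grid of tenths, the use of exact fractions, the convention that the product implication equals 1 when the antecedent is 0, and all function names are choices made here.

```python
from fractions import Fraction
from itertools import product

ZERO, ONE = Fraction(0), Fraction(1)


# imp(y, z) reads "y <- z" (consequent first); conj(x, z) reads "x & z"
def imp_product(y, z):
    return ONE if z == 0 else min(ONE, y / z)


def and_product(x, z):
    return x * z


def imp_godel(y, z):
    return ONE if z <= y else y


def and_godel(x, z):
    return min(x, z)


def imp_lukasiewicz(y, z):
    return min(ONE, ONE - z + y)


def and_lukasiewicz(x, z):
    return max(ZERO, x + z - ONE)


def satisfies_adjoint_condition(imp, conj, grid):
    """Check on a finite grid that x <= (y <- z) holds iff (x & z) <= y holds."""
    return all((x <= imp(y, z)) == (conj(x, z) <= y)
               for x, y, z in product(grid, repeat=3))


grid = [Fraction(k, 10) for k in range(11)]

print(satisfies_adjoint_condition(imp_product, and_product, grid))
print(satisfies_adjoint_condition(imp_godel, and_godel, grid))
print(satisfies_adjoint_condition(imp_lukasiewicz, and_lukasiewicz, grid))
# mixing an implication with the "wrong" conjunction breaks the condition
print(satisfies_adjoint_condition(imp_godel, and_product, grid))
```

The first three checks succeed and the mixed pairing fails, which is one way to see why each implication has to be paired with its own conjunctor.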

The need for monotonicity of the operators $\leftarrow$ and $\&$ is clear, if
they are to be interpreted as generalised implications and conjunctions. The
third property in the definition corresponds to categorical adjointness, but it
can be adequately interpreted in terms of multiple-valued inference as asserting
that the truth-value of $y \leftarrow z$ is the maximal x satisfying
$x \,\&\, z \preceq y$, and also the validity of the following generalised
modus ponens rule [6]:

If x is a lower bound of $\psi \leftarrow \varphi$, and z is a lower bound of
$\varphi$, then a lower bound y of $\psi$ is $x \,\&\, z$.

In addition to (a1)-(a3) it will be necessary to assume the existence of bottom
and top elements in the poset of truth-values (the zero and one elements), and the
existence of joins (suprema) for every directed subset; that is, we will assume
a structure of complete upper-semilattice (cus-lattice, for short) but nothing
about associativity, commutativity and general boundary conditions of $\&$. In
particular, the requirement that $(L, \&, \top)$ has to be a commutative monoid
in a residuated lattice is too restrictive, in that commutativity need not be
required in the proofs of soundness and correctness [9]. In this generality we
are able to work with approximations of t-norms and/or conjunctions learnt from
data by a neural net as in [7].

Extending the results in [2,3,9] to a more general setting, in which different
implications (Łukasiewicz, Gödel, product) and thus several modus ponens-like
inference rules are used, naturally leads to considering several adjoint pairs in
the lattice. More formally:

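Definition 5 (Multi-Adjoint Semilattice). Let $\langle L, \preceq \rangle$ be a
cus-lattice. A multi-adjoint semilattice $\mathcal{L}$ is a tuple
$(L, \preceq, \leftarrow_1, \&_1, \ldots, \leftarrow_n, \&_n)$ satisfying the
following items:

(l1) $\langle L, \preceq \rangle$ is bounded, i.e. it has bottom ($\bot$) and top ($\top$) elements;

(l2) $(\leftarrow_i, \&_i)$ is an adjoint pair in $\langle L, \preceq \rangle$ for $i = 1, \ldots, n$;

(l3) $\top \,\&_i\, \vartheta = \vartheta \,\&_i\, \top = \vartheta$ for all $\vartheta \in L$ for $i = 1, \ldots, n$.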

Remark 1. Note that residuated lattices are a special case of multi-adjoint
semilattice, in which the underlying poset has a cus-lattice structure, has
monoidal structure with respect to $\otimes$ and $\top$, and only one adjoint
pair is present.

From the point of view of expressiveness, it is interesting to allow extra
operators to be involved with the operators in the multi-adjoint semilattice. The
structure which captures this possibility is that of a multi-adjoint algebra.

Definition 6 (Multi-Adjoint $\Omega$-Algebra). Let $\Omega$ be a graded set
containing operators $\leftarrow_i$ and $\&_i$ for $i = 1, \ldots, n$ and
possibly some extra operators, and let $\mathfrak{L} = \langle L, I \rangle$ be
an $\Omega$-algebra whose carrier set L is a cus-lattice under $\preceq$.

We say that $\mathfrak{L}$ is a multi-adjoint $\Omega$-algebra with respect to
the pairs $(\leftarrow_i, \&_i)$ for $i = 1, \ldots, n$ if
$\mathcal{L} = (L, \preceq, I(\leftarrow_1), I(\&_1), \ldots, I(\leftarrow_n), I(\&_n))$
is a multi-adjoint semilattice.


In practice, we will usually have to assume some properties on the extra
operators considered. These extra operators will be assumed to be either
conjunctors or disjunctors or aggregators.

Example 1. Consider $\Omega = \{\leftarrow_P, \&_P, \leftarrow_G, \&_G, \land_L, @\}$,
the real unit interval $U = [0, 1]$ with its lattice structure, and the
interpretation function I defined as:

$I(\leftarrow_P)(x, y) = \min(1, x/y)$,   $I(\&_P)(x, y) = x \cdot y$,
$I(\leftarrow_G)(x, y) = 1$ if $y \le x$, and $x$ otherwise,   $I(\&_G)(x, y) = \min(x, y)$,
$I(@)(x, y, z) = \frac{1}{6}(x + 2y + 3z)$,   $I(\land_L)(x, y) = \max(0, x + y - 1)$,

that is, connectives are interpreted as product and Gödel connectives, a weighted
sum and Łukasiewicz conjunction; then $\langle U, I \rangle$ is a multi-adjoint
$\Omega$-algebra with one aggregator and one additional conjunctor (denoted
$\land_L$ to make explicit that its adjoint implicator is not in the language).

Note that the use of aggregators as weighted sums to some extent covers the
approach taken in [1] when considering the evidential support logic rules of
combination.

□ 


2.3  General  Approach  to  the  Syntax  of  Propositional  Languages 

The syntax of the propositional languages we will work with will be defined by
using the concept of $\Omega$-algebra. To begin with, the concept of alphabet of
the language is introduced below.

Definition 7 (Alphabet). Let $\Omega$ be a graded set, and $\Pi$ a countably
infinite set. The alphabet $A_{\Omega,\Pi}$ associated to $\Omega$ and $\Pi$ is
defined to be the disjoint union $\Omega \cup \Pi \cup S$, where S is the set of
auxiliary symbols "(", ")" and ",".

In the following, we will use only $A_\Omega$ to designate an alphabet, since
omitting the reference to $\Pi$ cannot lead to confusion.

Definition 8 (Expressions). Let $\Omega$ be a graded set and $A_\Omega$ an
alphabet. The $\Omega$-algebra $\mathfrak{E} = \langle A_\Omega^*, I \rangle$ of
expressions is defined as follows:

1. The carrier $A_\Omega^*$ is the set of strings over $A_\Omega$.

2. The interpretation function I satisfies the following conditions for strings
$a_1, \ldots, a_n$ in $A_\Omega^*$:

- $c_{\mathfrak{E}} = c$, where c is a constant operation ($c \in \Omega_0$).

- $\omega_{\mathfrak{E}}(a_1) = \omega\, a_1$, where $\omega$ is a unary operation ($\omega \in \Omega_1$).

- $\omega_{\mathfrak{E}}(a_1, a_2) = (a_1\, \omega\, a_2)$, where $\omega$ is a binary operation ($\omega \in \Omega_2$).

- $\omega_{\mathfrak{E}}(a_1, \ldots, a_n) = \omega(a_1, \ldots, a_n)$, where $\omega$ is an n-ary operation ($\omega \in \Omega_n$)
and $n > 2$.

Note that an expression is only a string of letters of the alphabet, that is,
it need not be a well-formed formula. Actually, the set of well-formed formulas
is the subset of the set of expressions defined as follows:

Definition 9 (Well-formed formulas). Let $\Omega$ be a graded set, $\Pi$ a
countable set of propositional symbols and $\mathfrak{E}$ the algebra of
expressions corresponding to the alphabet $A_{\Omega,\Pi}$. The well-formed
formulas (in short, formulas) generated by $\Omega$ over $\Pi$ is the least
subalgebra $\mathfrak{F}$ of the algebra of expressions $\mathfrak{E}$
containing $\Pi$.

The set of formulas, that is, the carrier of $\mathfrak{F}$, will be denoted
$F_\Omega$. It is well-known that least subalgebras can be defined as an
inductive closure, and it is not difficult to check that $\mathfrak{F}$ is
freely generated; therefore it satisfies the unique homomorphic extension
theorem stated below:

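Theorem 1. Let $\Omega$ be a graded set, $\Pi$ a set of propositional symbols,
$\mathfrak{F}$ the corresponding $\Omega$-algebra of formulas. Let $\mathfrak{L}$
be an arbitrary $\Omega$-algebra with carrier L. Then, for every function
$I\colon \Pi \to L$ there is a unique homomorphism $\hat{I}\colon F_\Omega \to L$
such that:

1. For all $p \in \Pi$, $\hat{I}(p) = I(p)$;

2. For each constant $c \in \Omega_0$, $\hat{I}(c_{\mathfrak{F}}) = c_{\mathfrak{L}}$;

3. For every $\omega \in \Omega_n$ with $n > 0$ and for all $F_i \in F_\Omega$ with $i = 1, \ldots, n$:

$\hat{I}(\omega_{\mathfrak{F}}(F_1, \ldots, F_n)) = \omega_{\mathfrak{L}}(\hat{I}(F_1), \ldots, \hat{I}(F_n))$.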
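The three clauses of the theorem amount to a recursive evaluation procedure, which can be sketched as follows. The encoding is an assumption made for illustration: a formula is either a propositional symbol (a string) or a tuple whose first component names an operator, and the operator names (`and_G`, `and_P`, `agg`, `top`) and the function `extend` are hypothetical. The operator interpretations follow Example 1.

```python
from fractions import Fraction


def extend(interp, operations):
    """Return the evaluation map on formulas induced by an assignment `interp`.

    A formula is a propositional symbol (str) or a tuple
    (operator_name, subformula_1, ..., subformula_n); constants are 1-tuples.
    """
    def evaluate(formula):
        if isinstance(formula, str):          # clause 1: propositional symbols
            return interp[formula]
        op, *args = formula                   # clauses 2 and 3: operators
        return operations[op](*(evaluate(a) for a in args))
    return evaluate


# interpretations of the operator symbols in the truth-value algebra [0, 1]
operations = {
    "top": lambda: Fraction(1),
    "and_G": min,
    "and_P": lambda x, y: x * y,
    "agg": lambda x, y, z: (x + 2 * y + 3 * z) / 6,
}

interp = {"p": Fraction(1, 2), "q": Fraction(1, 3), "r": Fraction(1)}
evaluate = extend(interp, operations)

formula = ("agg", "p", ("and_G", "p", "q"), "r")
print(evaluate(formula))
```

Because formulas are freely generated, each formula decomposes in exactly one way, so the recursion defines a single value per formula; that is the uniqueness part of the theorem in executable form.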

3  Syntax  and  Semantics  of  Multi-adjoint  Logic  Programs 

Multi-adjoint logic programs will be constructed from the abstract syntax
induced by a multi-adjoint algebra on a set of propositional symbols.
Specifically, we will consider a multi-adjoint $\Omega$-algebra $\mathfrak{L}$
whose extra operators are either conjunctors, denoted $\land_1, \ldots, \land_k$,
or disjunctors, denoted $\lor_1, \ldots, \lor_l$, or aggregators, denoted
$@_1, \ldots, @_m$. (This algebra will host the manipulation of the truth-values
of the formulas in our programs.)

In addition, let $\Pi$ be a set of propositional symbols and $\mathfrak{F}$ the
corresponding algebra of formulas freely generated from $\Pi$ by the operators
in $\Omega$. (This algebra will be used to define the syntax of a propositional
language.)

Remark 2. As we are working with two $\Omega$-algebras, and to lighten the
notation, we introduce a special notation to clarify which algebra an operator
belongs to, instead of continuously using either $\omega_{\mathfrak{L}}$ or
$\omega_{\mathfrak{F}}$. Let $\omega$ be an operator symbol in $\Omega$; its
interpretation under $\mathfrak{L}$ is denoted $\dot{\omega}$ (a dot on the
operator), whereas $\omega$ itself will denote $\omega_{\mathfrak{F}}$ when
there is no risk of confusion.

3.1 Syntax of Multi-adjoint Logic Programs

The definition of multi-adjoint logic program is given, as usual, as a set of rules
and facts. The particular syntax of these rules and facts is given below:

Definition 10 (Multi-Adjoint Logic Programs). A multi-adjoint logic program
is a set $\mathbb{P}$ of rules of the form
$\langle (A \leftarrow_i B), \vartheta \rangle$ such that:

1. The rule $(A \leftarrow_i B)$ is a formula of $\mathfrak{F}$;

2. The confidence factor $\vartheta$ is an element (a truth-value) of L;

3. The head of the rule A is a propositional symbol of $\Pi$;

4. The body formula B is a formula of $\mathfrak{F}$ built from propositional
symbols $B_1, \ldots, B_n$ ($n \ge 0$) by the use of conjunctors
$\&_1, \ldots, \&_n$ and $\land_1, \ldots, \land_k$, disjunctors
$\lor_1, \ldots, \lor_l$ and aggregators $@_1, \ldots, @_m$;

5. Facts are rules with body $\top$;

6. A query (or goal) is a propositional symbol intended as a question ?A
prompting the system.

Note that an arbitrary composition of conjunctors, disjunctors and aggregators
is also an aggregator.

Sometimes, we will represent the above pair as
$A \stackrel{\vartheta}{\leftarrow}_i @[B_1, \ldots, B_n]$, where¹
$B_1, \ldots, B_n$ are the propositional variables occurring in the body and @
is the aggregator obtained as a composition.


3.2  Semantics  of  Multi- adjoint  Logic  Programs 

Definition 11 (Interpretation). An interpretation is a mapping $I\colon \Pi \to L$.
The set of all interpretations of the formulas defined by the $\Omega$-algebra
$\mathfrak{F}$ in the $\Omega$-algebra $\mathfrak{L}$ is denoted
$\mathcal{I}_{\mathfrak{L}}$.

Note that by the unique homomorphic extension theorem, each of these
interpretations can be uniquely extended to the whole set of formulas $F_\Omega$.

The ordering $\preceq$ of the truth-values L can be easily extended to the set
of interpretations as usual:

Definition 12 (Semilattice of interpretations). Consider two interpretations
$I_1, I_2 \in \mathcal{I}_{\mathfrak{L}}$. Then,
$\langle \mathcal{I}_{\mathfrak{L}}, \sqsubseteq \rangle$ is a cus-lattice where
$I_1 \sqsubseteq I_2$ iff $I_1(p) \preceq I_2(p)$ for all $p \in \Pi$. The least
interpretation $\vartriangle$ maps every propositional symbol to the least
element $\bot$ of L.

A rule of a multi-adjoint logic program is satisfied whenever the truth-value
of the rule is greater than or equal to the confidence factor associated with
the rule. Formally:

Definition 13 (Satisfaction, Model). Given an interpretation
$I \in \mathcal{I}_{\mathfrak{L}}$, a weighted rule
$\langle A \leftarrow_i B, \vartheta \rangle$ is satisfied by I iff
$\vartheta \preceq \hat{I}(A \leftarrow_i B)$. An interpretation
$I \in \mathcal{I}_{\mathfrak{L}}$ is a model of a multi-adjoint logic program
$\mathbb{P}$ iff all weighted rules in $\mathbb{P}$ are satisfied by I.


¹ Note the use of square brackets in this context.
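Note the following equalities

$$\hat{I}(A \leftarrow_i \mathcal{B}) = \hat{I}(A) \mathbin{\dot{\leftarrow}_i} \hat{I}(\mathcal{B}) = I(A) \mathbin{\dot{\leftarrow}_i} \hat{I}(\mathcal{B})$$

and the evaluation of $\hat{I}(\mathcal{B})$ proceeds inductively as usual, till all propositional
symbols in $\mathcal{B}$ are reached and evaluated under $I$. For the particular case of a
fact (a rule with $\top$ in the body) satisfaction of $\langle A \leftarrow_i \top, \vartheta \rangle$ means

$$\vartheta \preceq \hat{I}(A \leftarrow_i \top) = I(A) \mathbin{\dot{\leftarrow}_i} \top$$

by property (a3) of adjoint pairs this is equivalent to $\vartheta \mathbin{\dot{\&}_i} \top \preceq I(A)$ and this by
assumption (13) of multi-adjoint semilattices gives $\vartheta \preceq I(A)$.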
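
As an illustration of Definition 13, the following minimal sketch checks rule satisfaction on an assumed concrete lattice: the unit interval $[0,1]$ with the product and Gödel adjoint pairs. The lattice, the function names and the sample interpretation are choices made for this example only; they are not taken from the paper.

```python
from fractions import Fraction as F

# Two adjoint pairs on [0, 1] (assumed example lattice): conjunctor and
# its residuated implication.
def and_product(x, y):
    return x * y

def imp_product(head, body):
    # Largest z such that z * body <= head.
    return F(1) if body <= head else head / body

def and_godel(x, y):
    return min(x, y)

def imp_godel(head, body):
    # Largest z such that min(z, body) <= head.
    return F(1) if body <= head else head

def satisfies(interp, head, implication, body_value, weight):
    """Definition 13: the rule is satisfied iff weight <= I(head) <-_i I^(body)."""
    return weight <= implication(interp[head], body_value)

# Hypothetical interpretation of two propositional symbols.
INTERP = {"p": F(1, 2), "q": F(3, 4)}

# Rule <p <-_product q, 3/5>: the implication evaluates to (1/2)/(3/4) = 2/3.
print(satisfies(INTERP, "p", imp_product, INTERP["q"], F(3, 5)))
# Fact <p <-_product T, 1/2>: satisfied exactly when 1/2 <= I(p).
print(satisfies(INTERP, "p", imp_product, F(1), F(1, 2)))
```

The fact case mirrors the derivation above: with body value $\top = 1$ the test reduces to comparing the weight with $I(A)$.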

Definition 14. An element $\lambda \in L$ is a correct answer for a program $\mathbb{P}$ and a
query ?A if for an arbitrary interpretation $I \colon \Pi \to L$ which is a model of $\mathbb{P}$ we
have $\lambda \preceq I(A)$.

4  Fix-Point  Semantics 

It  is  possible  to  generalise  the  immediate  consequences  operator,  given  by  van 
Emden  and  Kowalski  in  [4],  to  the  framework  of  multi-adjoint  logic  programs  as 
follows: 

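Definition 15. Let $\mathbb{P}$ be a multi-adjoint logic program. The immediate consequences
operator $T_{\mathbb{P}}^{\mathfrak{L}} \colon \mathcal{I}_{\mathfrak{L}} \to \mathcal{I}_{\mathfrak{L}}$, mapping interpretations to interpretations, is
defined by considering

$$T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) = \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\}$$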

Note  that  all  the  suprema  involved  in  the  definition  do  exist  because  L  is  assumed 
to  be  a  cus-lattice. 
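
The operator of Definition 15 can be sketched for finite programs as follows. This is only an illustration under stated assumptions: truth values are rationals in $[0,1]$, each rule is encoded as a hypothetical tuple (head, conjunctor adjoint to the rule's implication, body evaluator, weight), and the sample program is invented for the example. For a finite program the supremum is simply a maximum.

```python
from fractions import Fraction as F

def product(x, y):
    return x * y

def godel(x, y):
    return min(x, y)

# Hypothetical program: one proper rule and two facts (body value 1 = top).
PROGRAM = [
    ("p", product, lambda I: godel(I["q"], I["r"]), F(8, 10)),
    ("q", godel, lambda I: F(1), F(7, 10)),
    ("r", product, lambda I: F(1), F(1, 2)),
]
SYMBOLS = ("p", "q", "r", "s")

def tp(program, interp):
    """One application of the immediate consequences operator."""
    new = {a: F(0) for a in interp}  # sup of the empty set is bottom
    for head, conj, body, weight in program:
        new[head] = max(new[head], conj(weight, body(interp)))
    return new

def is_model(program, interp):
    """Post-fixpoint test: T(I) is below I pointwise."""
    image = tp(program, interp)
    return all(image[a] <= interp[a] for a in interp)

bottom = {a: F(0) for a in SYMBOLS}
step1 = tp(PROGRAM, bottom)
step2 = tp(PROGRAM, step1)
print(step2)
```

The function `is_model` uses the post-fixpoint characterisation of models rather than checking each rule against Definition 13 directly.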

As is usual in the logic programming framework, the semantics of a multi-adjoint
logic program is characterised by the post-fixpoints of $T_{\mathbb{P}}^{\mathfrak{L}}$.

Theorem 2. An interpretation $I$ of $\mathcal{I}_{\mathfrak{L}}$ is a model of a multi-adjoint logic program
$\mathbb{P}$ iff $T_{\mathbb{P}}^{\mathfrak{L}}(I) \sqsubseteq I$.

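Proof: Assume we have an interpretation $I$ for the program $\mathbb{P}$; then we have the
following chain of equivalent statements for all rules $\langle A \leftarrow_i \mathcal{B}, \vartheta \rangle$ in $\mathbb{P}$:

$$\begin{aligned}
& \vartheta \preceq \hat{I}(A \leftarrow_i \mathcal{B}) \\
\iff{} & \vartheta \preceq I(A) \mathbin{\dot{\leftarrow}_i} \hat{I}(\mathcal{B}) \\
\iff{} & \vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \preceq \hat{I}(A) = I(A) \\
\iff{} & \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \preceq I(A) \\
\iff{} & T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) \preceq I(A)
\end{aligned}$$

Thus, if $I$ is a model of $\mathbb{P}$, then for every $A$ occurring in the head of a rule
we have $T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) \preceq I(A)$. If $A$ is not the head of any rule, we have $T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) =
\sup \emptyset = \bot \preceq I(A)$ and, therefore, $I$ is a post-fixpoint for $T_{\mathbb{P}}^{\mathfrak{L}}$.

Reciprocally, assume that $I$ is a post-fixpoint for $T_{\mathbb{P}}^{\mathfrak{L}}$; then any rule $A \leftarrow_i \mathcal{B}$
is fulfilled.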

□ 

Note  that  the  fixpoint  theorem  works  even  without  any  further  assumptions 
on  conjunctors  (definitely  they  need  not  be  commutative  and  associative). 

The monotonicity of the operator $T_{\mathbb{P}}^{\mathfrak{L}}$, for the case of only one adjoint pair,
has been shown in [3]. The proof for the general case is similar.

Theorem 3 (Monotonicity of $T_{\mathbb{P}}^{\mathfrak{L}}$). The operator $T_{\mathbb{P}}^{\mathfrak{L}}$ is monotonic.

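Proof: Consider $I$ and $J$ two elements of $\mathcal{I}_{\mathfrak{L}}$ such that $I \sqsubseteq J$. We have to show
that $T_{\mathbb{P}}^{\mathfrak{L}}(I) \sqsubseteq T_{\mathbb{P}}^{\mathfrak{L}}(J)$.

Let $A$ be a propositional symbol in $\Pi$,

$$T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) = \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\}$$

If we had $\hat{I}(\mathcal{B}) \preceq \hat{J}(\mathcal{B})$ for all $\mathcal{B}$, then we would also have $\vartheta \mathbin{\dot{\&}_i} \hat{I}(\mathcal{B}) \preceq
\vartheta \mathbin{\dot{\&}_i} \hat{J}(\mathcal{B})$ for all $i$, since the operators $\dot{\&}_i$ are increasing. Now, by taking suprema

$$T_{\mathbb{P}}^{\mathfrak{L}}(I)(A) \preceq T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \quad \text{for all } A$$

Therefore, it is sufficient to prove that $\hat{I}(\mathcal{B}) \preceq \hat{J}(\mathcal{B})$ for all $\mathcal{B}$. We will use
structural induction:

If $\mathcal{B}$ is an atomic formula, then it is obvious, i.e.

$$\hat{I}(\mathcal{B}) = I(\mathcal{B}) \preceq J(\mathcal{B}) = \hat{J}(\mathcal{B})$$

For the inductive case, consider $\mathcal{B} = @[\mathcal{B}_1, \dots, \mathcal{B}_n]$ and assume that $\hat{I}(\mathcal{B}_i) \preceq
\hat{J}(\mathcal{B}_i)$ for all $i = 1, \dots, n$. By definition of the rules, we know that $@$ behaves as an
aggregator, and therefore, using the induction hypothesis

$$\hat{I}(\mathcal{B}) = \dot{@}[\hat{I}(\mathcal{B}_1), \dots, \hat{I}(\mathcal{B}_n)] \preceq \dot{@}[\hat{J}(\mathcal{B}_1), \dots, \hat{J}(\mathcal{B}_n)] = \hat{J}(\mathcal{B})$$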


□ 


Due to the monotonicity of the immediate consequences operator, the semantics
of $\mathbb{P}$ is given by its least model which, as shown by Knaster-Tarski's
theorem, is exactly the least fixpoint of $T_{\mathbb{P}}^{\mathfrak{L}}$, which can be obtained by transfinitely
iterating $T_{\mathbb{P}}^{\mathfrak{L}}$ from the least interpretation $\triangle$.
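
The iteration from the least interpretation can be sketched for a finite program over rationals in $[0,1]$. The rule encoding (head, conjunctor, body evaluator, weight) and the two-rule program below are assumptions made for this example. The program uses the arithmetic mean as aggregator, and the value of `p` climbs through 1/2, 3/4, 7/8, ... without reaching its limit 1 in finitely many steps, so the least fixpoint only appears as the supremum of the iterates.

```python
from fractions import Fraction as F

def product(x, y):
    return x * y

def mean(x, y):
    # Arithmetic mean, used here as an example aggregator.
    return (x + y) / 2

def tp(program, interp):
    """One application of the immediate consequences operator (finite program)."""
    new = {a: F(0) for a in interp}
    for head, conj, body, weight in program:
        new[head] = max(new[head], conj(weight, body(interp)))
    return new

def iterate(program, symbols, max_steps):
    """Return [bottom, T(bottom), T(T(bottom)), ...], stopping early at a fixpoint."""
    current = {a: F(0) for a in symbols}
    trace = [current]
    for _ in range(max_steps):
        nxt = tp(program, current)
        if nxt == current:
            break
        trace.append(nxt)
        current = nxt
    return trace

# Hypothetical program: <p <- @mean(p, q), 1> and the fact <q <- T, 1>.
LOOP = [
    ("p", product, lambda I: mean(I["p"], I["q"]), F(1)),
    ("q", product, lambda I: F(1), F(1)),
]

trace = iterate(LOOP, ["p", "q"], 5)
print([step["p"] for step in trace])
```

Because the iterates are computed with exact fractions, the early-stopping test never fires for this program within the step budget.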
The proof of the monotonicity of the $T_P$ operator in [2] is accompanied by
the following statement, surely due to their wanting to stress the embedding of
different logic programming paradigms:

The major difference to classical logic programming is that our $T_P$ may
not be continuous, and therefore more than countably many iterations
may be necessary to reach the least fixpoint.

In the line of the previous quotation, we would like to study sufficient conditions
for the continuity of the $T_{\mathbb{P}}^{\mathfrak{L}}$ operator.

5 On the Continuity of the $T_{\mathbb{P}}^{\mathfrak{L}}$ Operator

A first result in this approach is that whenever every operator in $\Omega$ turns out
to be continuous in the lattice, then $T_{\mathbb{P}}^{\mathfrak{L}}$ is also continuous and, consequently, its
least fixpoint can be obtained by a countably infinite iteration from the least
interpretation.

Let  us  state  the  definition  of  continuous  function  which  will  be  used. 

Definition 16. Let $L$ be a complete upper semilattice and let $f \colon L \to L$ be a
mapping. We say that $f$ is continuous if it preserves suprema of directed sets,
that is, given a directed set $X$ one has

$$f(\sup X) = \sup\{f(x) \mid x \in X\}$$

A mapping $g \colon L^n \to L$ is said to be continuous provided that it is continuous in
each argument separately.
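
To see how a monotone mapping can fail Definition 16, consider the following standard counter-example on $L = [0,1]$ (it is not taken from the paper and is included only as an illustration). The set $X = [0,1)$ is a chain, hence directed, and the step function below is monotone but does not preserve its supremum:

```latex
f(x) =
\begin{cases}
0 & \text{if } x < 1,\\
1 & \text{if } x = 1,
\end{cases}
\qquad X = [0,1): \quad
f(\sup X) = f(1) = 1 \;\neq\; 0 = \sup\{ f(x) \mid x \in X \}.
```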

Definition 17. Let $\mathfrak{F}$ be a language interpreted on a multi-adjoint $\Omega$-algebra $\mathfrak{L}$
and let $\omega$ be any operator symbol in the language. We say that $\omega$ is continuous
if its interpretation under $\mathfrak{L}$, that is $\dot{\omega}$, is continuous in $L$.

Now  we  state  and  prove  a  technical  lemma  which  will  allow  us  to  prove  the 
continuity  of  the  immediate  consequences  operator. 

Lemma 1. Let $\mathbb{P}$ be a program interpreted on a multi-adjoint $\Omega$-algebra $\mathfrak{L}$, and
let $\mathcal{B}$ be any body formula in $\mathbb{P}$. Assume that all the operators $@$ in $\mathcal{B}$ are continuous,
let $X$ be a directed set of interpretations, and write $S = \sup X$; then

$$\hat{S}(\mathcal{B}) = \sup\{\hat{J}(\mathcal{B}) \mid J \in X\}$$

Proof: Follows by induction. □

Theorem 4. If all the operators occurring in the bodies of the rules of a program
$\mathbb{P}$ are continuous, and the adjoint conjunctions are continuous in their
second argument, then $T_{\mathbb{P}}^{\mathfrak{L}}$ is continuous.
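Proof: We have to check that for each directed subset of interpretations $X$ and
each atomic formula $A$

$$T_{\mathbb{P}}^{\mathfrak{L}}(\sup X)(A) = \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\}$$

Let us write $S = \sup X$, and consider the following chain of equalities:

$$\begin{aligned}
T_{\mathbb{P}}^{\mathfrak{L}}(\sup X)(A)
&= \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{S}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \\
&\overset{(1)}{=} \sup\{\vartheta \mathbin{\dot{\&}_i} \sup\{\hat{J}(\mathcal{B}) \mid J \in X\} \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \\
&\overset{(2)}{=} \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{J}(\mathcal{B}) \mid J \in X \text{ and } A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \\
&= \sup\{\sup\{\vartheta \mathbin{\dot{\&}_i} \hat{J}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \mid J \in X\} \\
&= \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\}
\end{aligned}$$

where equality (1) follows from Lemma 1 and equality (2) follows from the continuity
of the operators $\dot{\&}_i$. □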

In  some  sense,  it  is  possible  to  reverse  the  implication  in  the  theorem  above. 

Theorem 5. If the operator $T_{\mathbb{P}}^{\mathfrak{L}}$ is continuous for every program $\mathbb{P}$ on $\mathfrak{L}$, then any
operator in the body of the rules is continuous.

Proof: Let $@$ be an $n$-ary connective. Assume an ordering on $L^n$ defined on
components. Denoting a tuple $(y_1, \dots, y_n) \in L^n$ as $\bar{y}$, the ordering in $L^n$ is:
$\bar{y} \le \bar{z}$ iff $y_i \preceq z_i$ for $i = 1, \dots, n$.

Let $Y$ be a directed set in $L^n$, and let us check that

$$@(\sup Y) = \sup\{@(y_1, \dots, y_n) \mid (y_1, \dots, y_n) \in Y\}$$

The inequality

$$\sup\{@(y_1, \dots, y_n) \mid (y_1, \dots, y_n) \in Y\} \preceq @(\sup Y) \qquad (1)$$

follows directly by monotonicity of $@$ and the definition of supremum.

For the other inequality, given $n$ propositional symbols $A_1, \dots, A_n \in \Pi$ and a
tuple $\bar{y} = (y_1, \dots, y_n) \in L^n$ consider the interpretation $I_{\bar{y}}$ defined as $I_{\bar{y}}(A_i) = y_i$
for $i = 1, \dots, n$ and $\bot$ otherwise. This way we have $I_{\bar{y}} \sqsubseteq I_{\bar{z}}$ if and only if $\bar{y} \le \bar{z}$.

Consider, now, the set $X_Y$ of interpretations $I_{\bar{y}}$ for all $\bar{y} \in Y$, and also
consider its supremum, $S_Y = \sup X_Y$. By the ordering in $L^n$ we have, for all
$\bar{y} \in Y$

$$(y_1, \dots, y_n) = (I_{\bar{y}}(A_1), \dots, I_{\bar{y}}(A_n)) \le (S_Y(A_1), \dots, S_Y(A_n))$$

therefore we have

$$\sup Y \le (S_Y(A_1), \dots, S_Y(A_n))$$
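now, by the monotonicity of $@$ we have

$$@(\sup Y) \preceq @(S_Y(A_1), \dots, S_Y(A_n)) = \hat{S}_Y(@(A_1, \dots, A_n)) \qquad (2)$$

On the other hand, consider the program $\mathbb{P}$ below consisting of only a rule

$$\langle A \leftarrow_i @(A_1, \dots, A_n), \top \rangle$$

by the assumption of continuity of $T_{\mathbb{P}}^{\mathfrak{L}}$ we have the following chain of equalities

$$\begin{aligned}
\hat{S}_Y(@(A_1, \dots, A_n))
&= \top \mathbin{\dot{\&}_i} \hat{S}_Y(@(A_1, \dots, A_n)) \\
&= \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{S}_Y(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \\
&= T_{\mathbb{P}}^{\mathfrak{L}}(S_Y)(A) \\
&= \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J_{\bar{y}})(A) \mid J_{\bar{y}} \in X_Y\} \\
&= \sup\{\sup\{\vartheta \mathbin{\dot{\&}_i} \hat{J}_{\bar{y}}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\} \mid J_{\bar{y}} \in X_Y\} \\
&= \sup\{\top \mathbin{\dot{\&}_i} \hat{J}_{\bar{y}}(@(A_1, \dots, A_n)) \mid J_{\bar{y}} \in X_Y\} \\
&= \sup\{@(J_{\bar{y}}(A_1), \dots, J_{\bar{y}}(A_n)) \mid J_{\bar{y}} \in X_Y\} \\
&= \sup\{@(y_1, \dots, y_n) \mid (y_1, \dots, y_n) \in Y\}
\end{aligned}$$

Finally, by Eqns. (2) and (1) and this result we have

$$\sup\{@(y_1, \dots, y_n)\} \preceq @(\sup Y) \preceq \hat{S}_Y(@(A_1, \dots, A_n)) = \sup\{@(y_1, \dots, y_n)\}$$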

□ 


Another Approach to the Continuity of $T_{\mathbb{P}}^{\mathfrak{L}}$

It  is  possible  to  generalise  the  previous  theorem  by  requiring  weaker  continuity 
conditions  on  the  operators  but,  at  the  same  time,  restricting  the  structure  of 
the  set  of  truth- values. 

Definition 18. Let $L$ be a poset and $f \colon L^n \to L$ a function. We say that $f$
is lower-semicontinuous, for short LSC, in $(\vartheta_1, \dots, \vartheta_n) \in L^n$ if for all $\varepsilon \prec
f(\vartheta_1, \dots, \vartheta_n)$ there exist $\delta_i$ for $i = 1, \dots, n$ such that whenever $(\mu_1, \dots, \mu_n)$
satisfies $\delta_i \prec \mu_i \preceq \vartheta_i$ then $\varepsilon \prec f(\mu_1, \dots, \mu_n) \preceq f(\vartheta_1, \dots, \vartheta_n)$.

A function $f$ is said to be lower-semicontinuous (or LSC) if it is
lower-semicontinuous at every point in its domain.

It  is  obvious  that  the  composition  of  two  lower-semicontinuous  functions  is 
also  lower-semicontinuous. 

Definition 19. A cpo $L$ is said to satisfy the supremum property if for all sets
$X \subseteq L$ and for all $\varepsilon$ we have that if $\varepsilon \prec \sup X$ then there exists $\delta \in X$ such that
$\varepsilon \prec \delta \preceq \sup X$.
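
As a small worked illustration (not from the paper), any chain such as $[0,1]$ has the supremum property: if $\varepsilon \prec \sup X$ then $\varepsilon$ is not an upper bound of $X$, so some $\delta \in X$ lies strictly above it. A non-linear lattice can fail the property for arbitrary subsets, as the four-element "diamond" shows:

```latex
L = \{\bot, a, b, \top\}, \qquad
\bot \prec a \prec \top, \quad \bot \prec b \prec \top, \quad
a \text{ and } b \text{ incomparable};
\\[4pt]
X = \{a, b\}, \quad \sup X = \top, \quad \varepsilon = a \prec \sup X,
\quad \text{but no } \delta \in X \text{ satisfies } a \prec \delta \preceq \top .
```

Note that the set $X$ used here is not directed; every directed subset of a finite lattice has a greatest element, so this example does not contradict the property when it is only required for directed sets.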
Lemma 1 also holds assuming LSC and the supremum property and, therefore,
the continuity of the $T_{\mathbb{P}}^{\mathfrak{L}}$ operator is obtained from the combined hypotheses
of LSC of the operators and the supremum property of the lattice of truth-values.

Lemma 2. Let $\mathbb{P}$ be a program interpreted on a multi-adjoint $\Omega$-algebra $\mathfrak{L}$ whose
carrier has the supremum property for directed sets. Let $\mathcal{B}$ be any body formula
in $\mathbb{P}$, and assume that all the operators in $\mathcal{B}$ are LSC. Let $X$ be a directed set
of interpretations, and write $S = \sup X$; then

$$\hat{S}(\mathcal{B}) = \sup\{\hat{J}(\mathcal{B}) \mid J \in X\}$$

Proof sketch: The following inequality is straightforward.

$$\sup\{\hat{J}(\mathcal{B}) \mid J \in X\} \preceq \hat{S}(\mathcal{B})$$

Now, assume the strict inequality and get a contradiction, using LSC and the
supremum property separately on each argument to obtain elements $J_i(\mathcal{B})$, then
apply directedness to get a uniform interpretation $J_0(\mathcal{B})$, finally use once again
LSC to get a contradiction. □

Theorem 6. If $L$ satisfies the supremum property, and all the operators in the
body are LSC and $\dot{\&}_i$ are LSC in their second argument, then the operator $T_{\mathbb{P}}^{\mathfrak{L}}$
is continuous.

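Proof: Let us prove that for a directed set $X$ and $S = \sup X$ we have that

$$T_{\mathbb{P}}^{\mathfrak{L}}(S)(A) = \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\}$$

by showing that $T_{\mathbb{P}}^{\mathfrak{L}}(S)(A)$ fulfils the properties of a supremum for the set
$\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\}$.

1. Clearly, by monotonicity of the operator $T_{\mathbb{P}}^{\mathfrak{L}}$ and the fact that $S = \sup X$,
we have that $T_{\mathbb{P}}^{\mathfrak{L}}(S)(A)$ is an upper bound for all the $T_{\mathbb{P}}^{\mathfrak{L}}(J)(A)$ with $J \in X$
and, therefore

$$\sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\} \preceq T_{\mathbb{P}}^{\mathfrak{L}}(S)(A)$$

2. Reasoning by contradiction, assume the strict inequality

$$\sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\} \prec T_{\mathbb{P}}^{\mathfrak{L}}(S)(A)$$

As $T_{\mathbb{P}}^{\mathfrak{L}}(S)(A) = \sup\{\vartheta \mathbin{\dot{\&}_i} \hat{S}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}\}$, by the supremum property,
taking $\varepsilon = \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\}$, we have that there exists a rule $A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P}$
such that

$$\sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\} = \varepsilon \prec \vartheta \mathbin{\dot{\&}_i} \hat{S}(\mathcal{B}) \preceq T_{\mathbb{P}}^{\mathfrak{L}}(S)(A)$$

By using lower-semicontinuity of $\dot{\&}_i$ on the strict inequality, we have that
there exists $\delta \prec \hat{S}(\mathcal{B})$ such that whenever $\delta \prec \lambda \preceq \hat{S}(\mathcal{B})$ then $\varepsilon \prec \vartheta \mathbin{\dot{\&}_i} \lambda \preceq
\vartheta \mathbin{\dot{\&}_i} \hat{S}(\mathcal{B})$.

Now, by Lemma 2, we have that $\hat{S}(\mathcal{B}) = \sup\{\hat{J}(\mathcal{B}) \mid J \in X\}$, so we can apply
once again the supremum property and select an element $J_0 \in X$ such that
$\delta \prec \hat{J}_0(\mathcal{B}) \preceq \hat{S}(\mathcal{B})$. For this element, by LSC of $\dot{\&}_i$ we have that

$$\varepsilon \prec \vartheta \mathbin{\dot{\&}_i} \hat{J}_0(\mathcal{B})$$

But this is contradictory with the fact that $\varepsilon = \sup\{T_{\mathbb{P}}^{\mathfrak{L}}(J)(A) \mid J \in X\} =
\sup\{\vartheta \mathbin{\dot{\&}_i} \hat{J}(\mathcal{B}) \mid A \xleftarrow{\vartheta}_i \mathcal{B} \in \mathbb{P} \text{ and } J \in X\}$.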

□ 


6  Conclusions  and  Future  Work 

We have presented a general theory of logic programming which allows the simultaneous
use of different implications in the rules and rather general connectives
in the bodies.

We have shown that models of our programs are post-fixpoints of the immediate
consequences operator $T_{\mathbb{P}}^{\mathfrak{L}}$, and that it is monotonic under very general
hypotheses. In addition we have proved the continuity of $T_{\mathbb{P}}^{\mathfrak{L}}$ under the
assumption of continuity of the operators in the language (except, possibly, the
implications). This hypothesis of continuity of the operators can be relaxed to
lower-semicontinuity, whenever we are working with a cus-lattice with the supremum
property. As future work we are planning to develop a complete procedural
semantics for multi-adjoint programs and to further investigate lattices with the
supremum property.
Abstract. According to Dynamic Logic Programming (DLP), knowledge
may be given by a sequence of theories (encoded as logic programs)
representing different states of knowledge. These may represent time
(e.g. in updates), specificity (e.g. in taxonomies), strength of updating
instance (e.g. in the legislative domain), hierarchical position of knowledge
source (e.g. in organizations), etc. The mutual relationships extant
among states are used to determine the semantics of the combined theory
composed of all the individual theories. Although suitable to encode
a single dimension (e.g. time, hierarchies...), DLP cannot deal with more
than one simultaneously because it is defined only for a linear sequence
of states. To overcome this limitation, we introduce the notion of
Multi-dimensional Dynamic Logic Programming ($\mathcal{MDLP}$), which generalizes
DLP to collections of states organized in arbitrary acyclic digraphs
representing precedence. In this setting, $\mathcal{MDLP}$ assigns semantics to sets
and subsets of such logic programs. By dint of this natural generalization,
$\mathcal{MDLP}$ affords extra expressiveness, in effect enlarging the latitude of
logic programming applications unifiable under a single framework. The
generality and flexibility provided by the acyclic digraphs ensure a wide
scope and variety of application possibilities.


1  Introduction  and  Motivation 

In [1], the paradigm of Dynamic Logic Programming (DLP) was introduced. It
eschews performing updates on a model basis, as in [8,15,16,19], and instead
treats them as a process of logic programming rule updates [13].

According to Dynamic Logic Programming (DLP), itself a generalization of
the notion of the update of a logic program P by another one U, knowledge is
given by a series of theories (encoded as generalized logic programs) representing
distinct supervenient states of the world. Different states, sequentially ordered,
can represent different time periods [1], different agents [9], different hierarchical
instances [17], or even different domains of knowledge [12]. Consequently,
individual theories may comprise mutually contradictory as well as overlapping
information. The role of DLP is to employ the mutual relationships extant among
different states to precisely determine the declarative as well as the procedural
semantics for the combined theory comprised of all individual theories at each
state. Intuitively, one can add, at the end of the sequence, newer or more specific
rules (arising from new, newly acquired, or more specific knowledge) leaving to
DLP the task of ensuring that these added rules are in force, and that previous
or less specific rules are still valid (by inertia) only so far as possible, i.e. that
they are kept for as long as they are not in conflict with newly added ones, these
always prevailing. The common feature among the applications of DLP is that
the states associated with the given set of theories encode only one of several
possible representational dimensions (e.g. time, hierarchies, domains,...).

For  example,  DLP  can  be  used  to  model  the  relationship  of  a  group  of  agents 
related  according  to  a  linear  hierarchy,  and  DLP  can  be  used  to  model  the 
evolution  of  a  single  agent  over  time.  But  DLP,  as  it  stands,  cannot  deal  with 
both  settings  at  once,  and  model  the  evolution  of  one  such  group  of  agents  over 
time, inasmuch as DLP is defined for linear sequences of states alone. Nor can it
model  hierarchical  relations  amongst  agents  that  have  more  than  one  superior 
(and  multiple  inheritance).  An  instance  of  a  multi-dimensional  scenario  is  legal 
reasoning,  where  legislative  agency  is  divided  conforming  to  a  hierarchy  of  power, 
governed  by  the  principle  Lex  Superior  (Lex  Superior  Derogat  Legi  Inferiori)  by 
which  the  rule  issued  by  a  higher  hierarchical  authority  overrides  the  one  issued 
by  a  lower  one,  and  the  evolution  of  law  in  time  is  governed  by  the  principle  Lex 
Posterior  (Lex  Posterior  Derogat  Legi  Priori)  by  which  the  rule  enacted  at  a 
later  point  in  time  overrides  the  earlier  one.  DLP  can  be  used  to  model  each  of 
these  principles  separately,  by  using  the  sequence  of  states  to  represent  either  the 
hierarchy  or  time,  but  is  unable  to  cope  with  both  at  once  when  they  interact. 

In effect, knowledge updating is not to be simply envisaged as taking place in
the time dimension alone. Several updating dimensions may combine simultaneously,
with or without the temporal one, such as specificity (as in taxonomies),
strength of the updating instance (as in the legislative domain), hierarchical
position of the knowledge source (as in organizations), credibility of the source (as
in uncertain, mined, or learnt knowledge), or opinion precedence (as in a society
of agents). For this combination to be possible, DLP needs to be extended to
allow for a more general structuring of states.

In this paper we introduce the notion of Multi-dimensional Dynamic Logic
Programming ($\mathcal{MDLP}$) which generalizes DLP to cater for collections of states
represented by arbitrary directed acyclic graphs. In this setting, $\mathcal{MDLP}$ assigns
semantics to sets and subsets of logic programs, depending on how they stand in
relation to one another, this relation being defined by the acyclic digraph (DAG)
that configures the states. By dint of such a natural generalization, $\mathcal{MDLP}$
affords extra expressiveness, thereby enlarging the latitude of logic programming
applications unifiable under a single framework. The generality and flexibility
provided by DAGs ensure a wide scope and variety of possibilities.

The remainder of this paper is structured as follows: in Section 2 we introduce
some background definitions; in Section 3 we introduce $\mathcal{MDLP}$ and proffer a
declarative semantics; in Section 4 some illustrative examples are presented;
in Section 5 an equivalent semantics based on a syntactical transformation is
provided, proven sound and complete with respect to the declarative semantics; in Section
6 we set forth some basic properties; in Section 7 we conclude and open the
doors of future developments.


2  Background 

Generalized Logic Programs and Their Stable Models To represent negative
information in logic programs and in their updates, since we need to allow
default negation not A not only in premises of their clauses but also in their
heads, we use generalized logic programs as defined in [1]¹.

By a generalized logic program $P$ in a language $\mathcal{L}$ we mean a finite or infinite
set of propositional clauses of the form $L_0 \leftarrow L_1, \dots, L_n$ where each $L_i$ is a
literal (i.e. an atom $A$ or the default negation of an atom not $A$). If $r$ is a clause
(or rule), by $H(r)$ we mean $L_0$, and by $B(r)$ we mean $L_1, \dots, L_n$. If $H(r) = A$
(resp. $H(r) = $ not $A$) then not $H(r) = $ not $A$ (resp. not $H(r) = A$). By a
(2-valued) interpretation $M$ of $\mathcal{L}$ we mean any set of literals from $\mathcal{L}$ that satisfies
the condition that for any $A$, precisely one of the literals $A$ or not $A$ belongs
to $M$. Given an interpretation $M$ we define $M^+ = \{A : A$ is an atom, $A \in M\}$
and $M^- = \{$not $A : A$ is an atom, not $A \in M\}$. Following established tradition,
wherever convenient we omit the default (negative) atoms when describing
interpretations and models. We say that a (2-valued) interpretation $M$ of $\mathcal{L}$ is a
stable model of a generalized logic program $P$ if $\rho(M) = least(\rho(P) \cup \rho(M^-))$,
where $\rho(.)$ univocally renames every default literal not $A$ in a program or model
into new atoms, say $not\_A$. The class of generalized logic programs can be viewed
as a special case of yet broader classes of programs, introduced earlier in [7] and
in [14], and, for the special case of normal programs, their semantics coincides
with the stable models one [6].


Graphs A directed graph, or digraph, $D = (V, E)$ is a pair of two finite or
infinite sets $V = V_D$ of vertices and $E = E_D$ of pairs of vertices or (directed)
edges. A directed edge sequence from $v_0$ to $v_n$ in a digraph is a sequence of
edges $e_1, e_2, \dots, e_n \in E_D$ such that $e_i = (v_{i-1}, v_i)$ for $i = 1, \dots, n$. A directed path
is a directed edge sequence in which all the edges are distinct. A directed acyclic
graph, or acyclic digraph (DAG), is a digraph $D$ such that there are no directed
edge sequences from $v$ to $v$, for all vertices $v$ of $D$. A source is a vertex with
in-valency 0 (number of edges for which it is a final vertex) and a sink is a vertex
with out-valency 0 (number of edges for which it is an initial vertex). We say
that $v < w$ if there is a directed path from $v$ to $w$ and that $v \le w$ if $v < w$ or
$v = w$. The transitive closure of a graph $D$ is a graph $D^+ = (V, E^+)$ such that
for all $v, w \in V$ there is an edge $(v, w)$ in $E^+$ if and only if $v < w$ in $D$. The
relevancy DAG of a DAG $D$ wrt a vertex $v$ of $D$ is $D_v = (V_v, E_v)$ where $V_v =
\{v_i : v_i \in V$ and $v_i \le v\}$ and $E_v = \{(v_i, v_j) : (v_i, v_j) \in E$ and $v_i, v_j \in V_v\}$. The
relevancy DAG of a DAG $D$ wrt a set of vertices $S$ of $D$ is $D_S = (V_S, E_S)$


¹ In [2] the reader can find the motivation for the usage of generalized logic programs,
instead of using simple denials by freely moving the head nots into the body.
where $V_S = \bigcup_{v \in S} V_v$ and $E_S = \bigcup_{v \in S} E_v$, where $D_v = (V_v, E_v)$ is the relevancy
DAG of $D$ wrt $v$.
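
The relevancy DAG can be computed by a backward traversal from the chosen vertices. The sketch below is an illustration only: the edge-list encoding and the function names are assumptions of this example, and the sample graph is a small diamond with vertices t, u, v, w.

```python
def ancestors_or_self(edges, v):
    """All vertices u with u <= v: v itself plus every vertex with a directed path to v."""
    preds = {}
    for a, b in edges:
        preds.setdefault(b, set()).add(a)
    seen, stack = {v}, [v]
    while stack:
        node = stack.pop()
        for p in preds.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def relevancy_dag(edges, targets):
    """Relevancy DAG wrt a set of vertices: union of the relevancy DAGs of each target."""
    keep = set()
    for v in targets:
        keep |= ancestors_or_self(edges, v)
    return keep, {(a, b) for a, b in edges if a in keep and b in keep}

# Diamond-shaped example graph: t precedes u and v, which both precede w.
EDGES = [("t", "u"), ("t", "v"), ("u", "w"), ("v", "w")]

print(relevancy_dag(EDGES, ["u"]))
print(relevancy_dag(EDGES, ["u", "v"]))
```

Keeping every edge whose endpoints both survive gives the same result as taking the union of the edge sets $E_v$, because any predecessor of a kept vertex is itself kept.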

3  Multi-dimensional  Dynamic  Logic  Programming 

As noted in the introduction, allowing the individual theories of a dynamic program
update to relate via a linear sequence of states only restricts DLP to
representing and reasoning about a single aspect of a system (e.g. time, hierarchy,...).
In this section we generalize DLP to allow for states represented by the vertices
of a DAG, and their precedence relations by the corresponding edges, thus enabling
concurrent representation, depending on the choice of a particular DAG,
of several dimensions of an updatable system. In particular, the DAG can stand
not only for a system of n independent dimensions, but also for inter-dimensional
precedence. In this setting, $\mathcal{MDLP}$ assigns semantics to sets and subsets of logic
programs, depending on how they so relate to one another.

We start by defining the framework consisting of the generalized logic programs
indexed by a DAG. Throughout this paper, we will restrict ourselves to
DAGs such that, for every vertex $v$ of the DAG, any path ending in $v$ is finite.

Definition 1 (Multi-dimensional Dynamic Logic Program). Let $\mathcal{L}$ be a
propositional language. A Multi-dimensional Dynamic Logic Program (MDLP),
$\mathcal{P}$, is a pair $(\mathcal{P}_D, D)$ where $D = (V, E)$ is a DAG and $\mathcal{P}_D = \{P_v : v \in V\}$ is
a set of generalized logic programs in the language $\mathcal{L}$, indexed by the vertices
$v \in V$ of $D$. We call states such vertices of $D$. For simplicity, we often leave $\mathcal{L}$
implicit.


3.1  Declarative  Semantics 

To characterize the models of $\mathcal{P}$ at any given state we will keep to the basic
intuition of logic program updates, whereby an interpretation is a stable model
of the update of a program $P$ by a program $U$ iff it is a stable model of a program
consisting of the rules of $U$ together with a subset of the rules of $P$, comprising
all those that are not rejected (rules are rejected when they are overridden by
program $U$, i.e. when they do not carry over by inertia). With the introduction of a DAG to index
programs, a program may have more than a single ancestor. This has to be dealt
with, the desired intuition being that a program $P_v \in \mathcal{P}_D$ can be used to reject
rules of any program $P_u \in \mathcal{P}_D$ if there is a directed path from $u$ to $v$. Moreover,
if some atom is not defined in the update nor in any of its ancestors, its negation
is assumed by default. Formally, the stable models of the MDLP are:

Definition 2 (Stable Models at state s). Let $\mathcal{P} = (\mathcal{P}_D, D)$ be a MDLP,
where $\mathcal{P}_D = \{P_v : v \in V\}$ and $D = (V, E)$. An interpretation $M_s$ is a stable
model of $\mathcal{P}$ at state $s \in V$, iff
$M_s = least([\mathcal{P}_s - Reject(s, M_s)] \cup Default(\mathcal{P}_s, M_s))$ where $A$ is an atom and:

$$\mathcal{P}_s = \bigcup_{i \le s} P_i$$

$$Reject(s, M_s) = \{r \in P_i \mid \exists r' \in P_j,\ i < j \le s,\ H(r) = not\ H(r') \wedge M_s \models B(r')\}$$

$$Default(\mathcal{P}_s, M_s) = \{not\ A \mid \nexists r \in \mathcal{P}_s : (H(r) = A) \wedge M_s \models B(r)\}$$

Intuitively, the set $Reject(s, M_s)$ contains those rules belonging to a program
indexed by a state $i$ that are overridden by the head of another rule with true
body in state $j$ along a path to state $s$. $\mathcal{P}_s$ contains all rules of all programs that
are indexed by a state along all paths to state $s$, i.e. all rules that are potentially
relevant to determine the semantics at state $s$. The set $Default(\mathcal{P}_s, M_s)$ contains
default negations not $A$ of all unsupported atoms $A$, i.e., those atoms $A$ for which
there is no rule in $\mathcal{P}_s$ whose body is true in $M_s$.

Example 1. Consider the diamond shaped MDLP $\mathcal{P} = (\mathcal{P}_D, D)$ such that $\mathcal{P}_D =
\{P_t, P_u, P_v, P_w\}$ and $D = (\{t, u, v, w\}, \{(t, u), (t, v), (u, w), (v, w)\})$ where

$P_t = \{d \leftarrow\}$    $P_u = \{a \leftarrow not\ e\}$    $P_v = \{not\ a \leftarrow d\}$

$P_w = \{not\ a \leftarrow b;\ b \leftarrow not\ c;\ c \leftarrow not\ b\}$

The only stable model at state $w$ is $M_w = \{not\ a, b, not\ c, d, not\ e\}$. In fact, we
have that $Reject(w, M_w) = \{a \leftarrow not\ e\}$ and $Default(\mathcal{P}_w, M_w) = \{not\ c, not\ e\}$
and, finally,

$[\mathcal{P}_w - Reject(w, M_w)] \cup Default(\mathcal{P}_w, M_w) =$

$= \{d \leftarrow;\ not\ a \leftarrow d;\ not\ a \leftarrow b;\ b \leftarrow not\ c;\ c \leftarrow not\ b\} \cup \{not\ c, not\ e\}$

whose least model is $M_w$. $M_w$ is the only stable model at state $w$.
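
The definition of stable models at a state can be checked on Example 1 by brute force over all interpretations. The sketch below is a naive illustration under assumptions of its own: literals are encoded as strings ("a" and "not a"), rules as (head, body) pairs, and the enumeration is exponential in the number of atoms, so it is only meant for tiny programs.

```python
from itertools import product

def atom_of(lit):
    return lit[4:] if lit.startswith("not ") else lit

def complement(lit):
    return lit[4:] if lit.startswith("not ") else "not " + lit

def states_up_to(edges, s):
    """All states i with i <= s (s and every state with a directed path to s)."""
    seen, stack = {s}, [s]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if b == node and a not in seen:
                seen.add(a)
                stack.append(a)
    return seen

def least_model(rules):
    """Least model of a definite program; 'not A' literals are treated as plain atoms."""
    derived, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in derived and all(l in derived for l in body):
                derived.add(head)
                changed = True
    return derived

def stable_models(programs, edges, s):
    relevant = states_up_to(edges, s)
    below = {j: states_up_to(edges, j) - {j} for j in relevant}  # i < j
    rules = [(i, h, tuple(b)) for i in sorted(relevant) for h, b in programs[i]]
    atoms = sorted({atom_of(l) for _, h, b in rules for l in (h, *b)})
    models = []
    for signs in product((True, False), repeat=len(atoms)):
        m = {a if pos else "not " + a for a, pos in zip(atoms, signs)}
        holds = lambda body: all(l in m for l in body)
        # Keep a rule unless a later rule with complementary head and true body rejects it.
        kept = [(h, b) for i, h, b in rules
                if not any(h2 == complement(h) and i in below[j] and holds(b2)
                           for j, h2, b2 in rules)]
        # Default negation for atoms with no rule whose body is true in m.
        defaults = [("not " + a, ()) for a in atoms
                    if not any(h == a and holds(b) for _, h, b in rules)]
        if least_model(kept + defaults) == m:
            models.append(frozenset(m))
    return models

PROGRAMS = {
    "t": [("d", [])],
    "u": [("a", ["not e"])],
    "v": [("not a", ["d"])],
    "w": [("not a", ["b"]), ("b", ["not c"]), ("c", ["not b"])],
}
EDGES = [("t", "u"), ("t", "v"), ("u", "w"), ("v", "w")]

print(stable_models(PROGRAMS, EDGES, "w"))
```

Only atoms occurring in the programs relevant to the queried state are enumerated, which matches the restriction to the rules of states below or equal to it.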

The  next  proposition  establishes  that  to  determine  the  models  of  a  MDLP 
at  state  s,  we  need  only  consider  the  part  of  the  MDLP  corresponding  to  the 
relevancy  graph  wrt  state  s. 

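Proposition 1. Let $\mathcal{P} = (\mathcal{P}_D, D)$ be a MDLP, where $\mathcal{P}_D = \{P_v : v \in V\}$ and
$D = (V, E)$. Let $s$ be a state in $V$. Let $\mathcal{P}' = (\mathcal{P}_{D_s}, D_s)$ be a MDLP where $D_s =
(V_s, E_s)$ is the relevancy DAG of $D$ wrt $s$, and $\mathcal{P}_{D_s} = \{P_v : v \in V_s\}$. $M$ is a
stable model of $\mathcal{P}$ at state $s$ iff $M$ is a stable model of $\mathcal{P}'$ at state $s$.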

We  might  have  a  situation  where  we  desire  to  determine  the  semantics  jointly 
at  more  than  one  state.  If  all  these  states  belong  to  the  relevancy  graph  of  one 
of  them,  we  simply  determine  the  models  at  that  state  (Prop.  1).  But  this  might 
not  be  the  case.  Formally,  the  semantics  of  a  MDLP  at  an  arbitrary  set  of  its 
states  is  determined  by  the  definition: 

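Definition 3 (Stable Models at a set of states S). Let $\mathcal{P} = (\mathcal{P}_D, D)$ be a
MDLP, where $\mathcal{P}_D = \{P_v : v \in V\}$ and $D = (V, E)$. Let $S$ be a set of states such
that $S \subseteq V$. An interpretation $M_S$ is a stable model of $\mathcal{P}$ at the set of states $S$
iff $M_S = least([\mathcal{P}_S - Reject(S, M_S)] \cup Default(\mathcal{P}_S, M_S))$ where:

$$\mathcal{P}_S = \bigcup_{s \in S} \Big( \bigcup_{i \le s} P_i \Big)$$

$$Reject(S, M_S) = \{r \in P_i \mid \exists s \in S,\ \exists r' \in P_j,\ i < j \le s,\ H(r) = not\ H(r') \wedge M_S \models B(r')\}$$

$$Default(\mathcal{P}_S, M_S) = \{not\ A \mid \nexists r \in \mathcal{P}_S : (H(r) = A) \wedge M_S \models B(r)\}$$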


This is equivalent to the addition of a new vertex $\alpha$ to the DAG, and connecting
to $\alpha$, by addition of edges, all states we wish to consider. Furthermore,
the program indexed by $\alpha$ is empty. We then determine the stable models of this
new MDLP at state $\alpha$. In Section 6, we provide semantics preserving simplifications
of these definitions, according to which only a subset of these newly added
edges is needed. Note the addition of state $\alpha$ does not affect the stable models
at other states. Indeed, $\alpha$ and the newly introduced edges do not belong to the
relevancy DAG wrt. any other state. A particular case of the above definition is
when $S = V$, corresponding to the semantics of the whole MDLP.


4  Illustrative  Examples 

By its very motivation and design, MDLP is well suited for combining knowledge
arising from various sources, especially when some of these sources have
priority over the others, that is, when rules from some sources are used
to reject rules of other, lower-priority sources. In particular, MDLP is well suited
for combining knowledge originating within hierarchically organized sources, as
the following schematic example illustrates by combining knowledge coming
from diverse sectors of such an organization.

Example  2.  Consider  a  company  with  a  president,  a  board  of  directors  and  (at 
least)  two  departments:  the  quality  management  and  financial  ones. 

To  improve  the  quality  of  the  products  produced  by  the  company,  the  quality 
management  department  has  decided  not  to  buy  any  product  whose  reliability 
is less than guaranteed. In other words, it has adopted the rule¹:

not buy(X) ← not reliable(X)

On the other hand, to save money, the financial department has decided to
buy products of a type in need if they are cheap, viz.

buy(X) ← type(X, T), needed(T), cheap(X)

The board of directors, in order to keep production going, has decided that
whenever there is still a need for a type of product, exactly one product of that
type must be bought. This can be coded by the following logic programming
rules, stating that if X is a product of a needed type, and if the need for that
type of product has not been already satisfied by buying some other product of
that type, then X must be bought; if the need is satisfied by buying some other
product of that type, then X should not be bought:

buy(X) ← type(X, T), needed(T), not satByOther(T, X)
not buy(X) ← type(X, T), needed(T), satByOther(T, X)
satByOther(T, X) ← type(Y, T), buy(Y), X ≠ Y

¹ Rules with variables stand for the set of all their ground instances.

Fig. 1.

Finally,  the  president  decided  for  the  company  never  to  buy  products  that 
have  a  cheap  alternative.  I.e.  if  two  products  are  of  the  same  type,  and  only  one 
of  them  is  cheap,  the  company  should  not  buy  the  other: 

not buy(X) ← type(X, T), type(Y, T), X ≠ Y, cheap(Y), not cheap(X)

Suppose further that there are two products, a and b, the first being cheap
and the latter reliable, both of type t, which is a needed type.

According  to  the  company’s  organizational  chart,  the  rules  of  the  president 
can  overrule  those  of  all  other  sectors,  and  those  established  by  the  board  can 
overrule  those  decided  by  the  departments.  No  department  has  precedence  over 
any  other.  This  situation  can  easily  be  modeled  by  the  MDLP  of  Figure  1. 

To know what the decision of each sector would be about which products to buy,
without taking into consideration the deliberation of its superiors,
all that needs to be done is to determine the stable models at the state corresponding
to that sector. For example, the reader can check that at state QMD there is
a single stable model in which both not buy(a) and not buy(b) are true. At the
state BD there are two stable models: one in which buy(a) and not buy(b) are
true; another where not buy(a) and buy(b) are true instead.

372 Joao Alexandre Leite et al.

More  interesting  would  be  to  know  what  is  the  decision  of  the  company  as 
a  whole,  when  taking  into  account  the  rules  of  all  sectors  and  their  hierarchical 
organization.  This  is  reflected  by  the  stable  models  of  the  whole  MDLP,  i.e.  the 
stable  models  at  the  set  of  all  states  of  the  MDLP.  The  reader  can  check  that, 
in  this  instance,  there  is  a  single  stable  model  in  which  buy{a)  and  not  buy{b) 
are  true.  It  coincides  with  the  single  stable  model  at  state  president  because  all 
other states belong to its relevancy graph. □

The next example describes how MDLP can deal with collision principles,
e.g. those found in legal reasoning, such as Lex Superior (Lex Superior Derogat Legi
Inferiori), according to which the rule issued by a higher hierarchical authority
overrides the one issued by a lower one, and Lex Posterior (Lex Posterior Derogat
Legi Priori), according to which the rule enacted at a later point in time overrides
the earlier one, i.e. how a temporal and a hierarchical
dimension can be combined into a single MDLP.

Example  3.  In  February  97,  the  President  of  Brazil  (PB)  passed  a  law  determin¬ 
ing  that,  in  order  to  guarantee  the  safety  aboard  public  transportation  airplanes, 
all  weapons  were  forbidden.  Furthermore,  all  exceptional  situations  that,  due  to 
public  interest,  require  an  armed  law  enforcement  or  military  agent  are  to  be 
the  subject  of  specific  regulation  by  the  Military  and  Justice  Ministries.  We  will 
refer  to  this  as  rule  1.  At  the  time  of  this  event,  there  was  in  force  an  internal 
norm of the Department of Civil Aviation (DCA) stating that “Armed Forces
Officials and Police Officers can board with their weapons if their destination is
a  national  airport”.  We  will  refer  to  this  as  rule  2.  Restricting  ourselves  to  the 
essential  parts  of  these  regulations,  they  can  be  encoded  by  the  generalized  logic 
program  clauses: 


rule1 : not carry_weapon ← not exception
rule2 : carry_weapon ← armed_officer

Let us consider a lattice with two distinct dimensions, corresponding to the two
principles governing this situation: Lex Superior ($d_1$) and Lex Posterior ($d_2$). Besides
the two agencies involved in this situation (PB and DCA), we will consider
two time points representing the time when the two regulations were enacted.
We have then a graph whose vertices are $\{(PB, 1), (PB, 2), (DCA, 1), (DCA, 2)\}$
(in the form (agency, time)) as portrayed in Fig.2. We have that $P_{DCA,1}$ contains
rule 2, $P_{PB,2}$ contains rule 1 and the other two programs are empty. Let us
further assume that there is an armed_officer represented by a fact in $P_{DCA,1}$.
Applying Def.2, for $M_{PB,2} = \{not\ carry\_weapon, not\ exception, armed\_officer\}$
at state $(PB, 2)$ we have that:

$$Reject((PB, 2), M_{PB,2}) = \{carry\_weapon \leftarrow armed\_officer\}$$

$$Default(\mathcal{P}_{PB,2}, M_{PB,2}) = \{not\ exception\}$$

it is trivial to see that

$$M_{PB,2} = least([\mathcal{P}_{PB,2} - Reject((PB, 2), M_{PB,2})] \cup Default(\mathcal{P}_{PB,2}, M_{PB,2}))$$

which means that in spite of rule 2, since the exceptions have not been regulated
yet, rule 1 prevails for all situations, and no one can carry a weapon aboard an
airplane. This would correspond to the only stable model at state $(PB, 2)$. □

Fig. 2.

The applicability of MDLP in a multi-agent context is not limited to the assignment
of a single semantics to the overall system, i.e., the multi-agent system
does not have to be described by a single DAG. Instead, we could determine
each agent’s view of the world by associating a DAG with each agent, representing
its own view of its relationships to other agents and of these amongst
themselves. The stable models over a set of states from DAGs of different agents
can provide us with inter-agent views.

Example 4. Consider a society of agents representing a hierarchically structured
research group. We have the Senior Researcher ($A_{sr}$), two Researchers ($A_{r1}$
and $A_{r2}$) and two students ($A_{s1}$ and $A_{s2}$) supervised by the two Researchers.
The hierarchy is deployed in Fig.3 a), which also represents the view of the
Senior Researcher. Typically, students think they are always right and do not like
hierarchies, so their view of the community is quite different. Fig.3 b) manifests
one possible view by $A_{s1}$. In this scenario, we could use MDLP to determine
and eventually compare $A_{sr}$'s view, given by the stable models at state $sr$ in
Fig.3 a), with $A_{s1}$'s view, given by the stable models at state $s1$ in Fig.3 b). If
we assign the following simple logic programs to the five agents:

$P_{sr} = \{a \leftarrow b\}$   $P_{s1} = \{not\ a \leftarrow c\}$   $P_{s2} = \{\}$   $P_{r1} = \{b\}$   $P_{r2} = \{c\}$


Fig. 3. a) The hierarchy as viewed by the Senior Researcher; b) one possible view by student $A_{s1}$.
we have that state $sr$ in Fig.3 a) has $M_{sr} = \{a, b, c\}$ as the only stable model,
and state $s1$ in Fig.3 b) has $M_{s1} = \{not\ a, b, c\}$ as its only stable model. That
is, according to student $A_{s1}$'s view of the world a is false, while according to the
senior researcher $A_{sr}$'s view of the world a is true. □
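
The two results of Example 4 can be checked mechanically with a small brute-force sketch. It is not the authors' implementation: it enumerates all two-valued interpretations and tests the fixpoint condition $M = least([\mathcal{P}_s - Reject(s, M)] \cup Default(\mathcal{P}_s, M))$, assuming that $\mathcal{P}_s$ collects the programs of $s$ and of all its ancestors. The function names and the string encoding of literals are illustrative. The edge set used for Fig.3 b) is an assumption (every other agent placed directly below s1), since the figure is not recoverable here.

```python
from itertools import product

def neg(lit):
    """Complement of a literal: 'a' <-> 'not a'."""
    return lit[4:] if lit.startswith("not ") else "not " + lit

def ancestors(edges, s):
    """States with a directed path to s (the states i with i < s)."""
    found, frontier = set(), {s}
    while frontier:
        frontier = {u for (u, v) in edges if v in frontier} - found
        found |= frontier
    return found

def least(rules, facts):
    """Least model of a definite program; 'not a' is treated as a plain atom."""
    model, changed = set(facts), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def stable_models(programs, edges, s):
    """Brute-force stable models at state s (exponential in the number of atoms)."""
    below = ancestors(edges, s) | {s}
    tagged = [(v, h, tuple(b)) for v in sorted(below) for (h, b) in programs.get(v, [])]
    atoms = sorted({neg(l) if l.startswith("not ") else l
                    for (_, h, b) in tagged for l in (h, *b)})
    models = []
    for bits in product([True, False], repeat=len(atoms)):
        m = {a if t else "not " + a for a, t in zip(atoms, bits)}
        # A rule in P_i is rejected by a conflicting rule in P_j, i < j, whose body holds in m.
        rejected = {(i, h, b) for (i, h, b) in tagged
                    if any(h2 == neg(h) and set(b2) <= m and i in ancestors(edges, j)
                           for (j, h2, b2) in tagged)}
        kept = [(h, b) for (i, h, b) in tagged if (i, h, b) not in rejected]
        # 'not a' holds by default when no rule for a has a body true in m.
        default = {"not " + a for a in atoms
                   if not any(h == a and set(b) <= m for (_, h, b) in tagged)}
        if least(kept, default) == m:
            models.append(m)
    return models

programs = {"sr": [("a", ["b"])], "s1": [("not a", ["c"])], "s2": [],
            "r1": [("b", [])], "r2": [("c", [])]}
edges_a = {("s1", "r1"), ("s2", "r2"), ("r1", "sr"), ("r2", "sr")}   # Fig.3 a)
edges_b = {("sr", "s1"), ("r1", "s1"), ("r2", "s1"), ("s2", "s1")}   # assumed shape of Fig.3 b)

print(stable_models(programs, edges_a, "sr"))
print(stable_models(programs, edges_b, "s1"))
```

Under these assumptions the first call yields the single model containing a, b and c, and the second the single model containing not a, b and c, in agreement with the example.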

This example suggests MDLP to be a useful practical framework to study
changes in behaviour of such multi-agent systems and how they hinge on the
relationships amongst the agents, i.e. on the current DAG that represents them.
MDLP offers a staple basic tool for the formal study of the social behaviour in
multi-agent communities [10].

5  Transformational  Semantics  for  MDLP 

Definition 2 above establishes the semantics of MDLP by characterizing its
stable models at each state. Next we present an alternative definition, based on
a purely syntactical transformation that, given a MDLP, produces a generalized
logic program whose stable models are in a one-to-one equivalence relation with
the stable models of the MDLP previously characterized. The computation of the
stable models at some state s reduces to the computation of the transformation
followed by the computation of the stable models of the transformed program.
This directly provides for an implementation of MDLP (publicly available at
centria.di.fct.unl.pt/~jja/updates) and a means to study its complexity.

Without loss of generality, we extend the DAG $D$ with an initial state ($s_0$)
and a set of directed edges $(s_0, s')$ connecting the initial state to all the sources
of $D$. Similarly, if we wish to query a set of states, all that needs doing is extending the
MDLP with a new state $\alpha$, as mentioned before, prior to the transformation.
By $\overline{\mathcal{L}}$ we denote the language obtained from language $\mathcal{L}$ such that
$\overline{\mathcal{L}} = \mathcal{L} \cup \{A^-, A_s, A_s^-, A_{P_s}, A_{P_s}^-, reject(A_s), reject(A_s^-) : A \in \mathcal{L}, s \in V \cup \{s_0\}\}$.

Definition 4 (Multi-dimensional Dynamic Program Update). Let $\mathcal{P}$ be
a MDLP, where $\mathcal{P} = (\mathcal{P}_D, D)$, $\mathcal{P}_D = \{P_v : v \in V\}$ and $D = (V, E)$. Given a fixed
state $s \in V$, the multi-dimensional dynamic program update over $\mathcal{P}$ at state $s$
is the generalized logic program $\boxplus_s \mathcal{P}$, which consists of the clauses below in the
extended language $\overline{\mathcal{L}}$, where $D_s = (V_s, E_s)$ is the relevancy DAG of $D$ wrt $s$:

(RP) Rewritten program clauses:

$$A_{P_v} \leftarrow B_1, \ldots, B_m, C_1^-, \ldots, C_n^- \qquad A_{P_v}^- \leftarrow B_1, \ldots, B_m, C_1^-, \ldots, C_n^-$$

for any clause:

$$A \leftarrow B_1, \ldots, B_m, not\, C_1, \ldots, not\, C_n$$

respectively, for any clause:

$$not\, A \leftarrow B_1, \ldots, B_m, not\, C_1, \ldots, not\, C_n$$

in the program $P_v$, where $v \in V_s$.
(IR) Inheritance rules:

$$A_v \leftarrow A_u, not\, reject(A_u) \qquad A_v^- \leftarrow A_u^-, not\, reject(A_u^-)$$

for all atoms $A \in \mathcal{L}$ and all $(u, v) \in E_s$. The inheritance rules say that an atom
$A$ is true (resp. false) at state $v \in V_s$ if it is true (resp. false) at any ancestor
state $u$ and it is not rejected.

(RR) Rejection Rules:

$$reject(A_u^-) \leftarrow A_{P_v} \qquad reject(A_u) \leftarrow A_{P_v}^-$$

for all atoms $A \in \mathcal{L}$ and all $u, v \in V_s$ where $u < v$. The rejection rules say that
if an atom $A$ is true (resp. false) in the program $P_v$, then it rejects inheritance
of any false (resp. true) atoms of any ancestor.

(UR) Update rules:

$$A_v \leftarrow A_{P_v} \qquad A_v^- \leftarrow A_{P_v}^-$$

for all atoms $A \in \mathcal{L}$ and all $v \in V_s$. Update rules state that atom $A$ must be true
(resp. false) at state $v \in V_s$ if it is made true (resp. false) in the program $P_v$.

(DR) Default Rules:

$$A_{s_0}^-$$

for all atoms $A \in \mathcal{L}$. Default rules describe the initial state $s_0$ by making all
atoms false at that state.

($CS_s$) Current State Rules:

$$A \leftarrow A_s \qquad A^- \leftarrow A_s^- \qquad not\, A \leftarrow A_s^-$$

for all atoms $A \in \mathcal{L}$. Current state rules specify the state $s$ at which the program
is being evaluated and determine the values of the atoms $A$, $A^-$ and $not\, A$.
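
To make the shape of the transformation concrete, the following sketch generates the clause groups (RP), (IR), (RR), (UR), (DR) and (CS) as plain (head, body) pairs for a small propositional MDLP. It is only a structural illustration under stated assumptions and not the implementation mentioned above. The string encoding (`a_Pv` for $A_{P_v}$, `a-_v` for $A_v^-$), the function names and the two-state example are all hypothetical. Clauses that would mention a program of $s_0$ are omitted, since $s_0$ indexes no program.

```python
def atom_of(lit):
    """Underlying atom of a literal ('not a' -> 'a')."""
    return lit[4:] if lit.startswith("not ") else lit

def strict_ancestors(edges, s):
    """States with a directed path to s."""
    found, frontier = set(), {s}
    while frontier:
        frontier = {u for (u, v) in edges if v in frontier} - found
        found |= frontier
    return found

def update_clauses(programs, edges, s, s0="s0"):
    """Clauses of the update at state s, as (head, body) pairs of strings."""
    states = {v for e in edges for v in e} | set(programs)
    sources = {v for v in states if not any(w == v for (_, w) in edges)}
    edges = set(edges) | {(s0, v) for v in sources}        # add the initial state
    v_s = strict_ancestors(edges, s) | {s}                  # relevancy DAG wrt s
    e_s = {(u, v) for (u, v) in edges if u in v_s and v in v_s}
    atoms = sorted({atom_of(l) for v in programs
                    for (h, b) in programs[v] for l in (h, *b)})
    clauses = []
    for v in sorted(v_s - {s0}):                            # (RP)
        for head, body in programs.get(v, []):
            new_body = [atom_of(l) + "-" if l.startswith("not ") else l for l in body]
            sign = "-" if head.startswith("not ") else ""
            clauses.append((atom_of(head) + sign + f"_P{v}", new_body))
    for a in atoms:
        for (u, v) in sorted(e_s):                          # (IR)
            clauses.append((f"{a}_{v}", [f"{a}_{u}", f"not reject({a}_{u})"]))
            clauses.append((f"{a}-_{v}", [f"{a}-_{u}", f"not reject({a}-_{u})"]))
        for v in sorted(v_s - {s0}):
            for u in sorted(strict_ancestors(e_s, v)):      # (RR), for all u < v
                clauses.append((f"reject({a}-_{u})", [f"{a}_P{v}"]))
                clauses.append((f"reject({a}_{u})", [f"{a}-_P{v}"]))
            clauses.append((f"{a}_{v}", [f"{a}_P{v}"]))     # (UR)
            clauses.append((f"{a}-_{v}", [f"{a}-_P{v}"]))
        clauses.append((f"{a}-_{s0}", []))                  # (DR)
        clauses += [(a, [f"{a}_{s}"]),                      # (CS)
                    (f"{a}-", [f"{a}-_{s}"]),
                    (f"not {a}", [f"{a}-_{s}"])]
    return clauses

# Hypothetical two-state chain u -> v, evaluated at v.
example = update_clauses({"u": [("a", [])], "v": [("not a", ["not b"])]},
                         {("u", "v")}, "v")
for head, body in example:
    print(head, "<-", ", ".join(body))
```

The printed program can then be handed to any stable model solver that accepts default-negated heads; that step is outside this sketch.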

This  transformation  depends  on  the  prior  determination  of  the  relevancy 
graph  wrt.  the  given  state.  This  choice  was  based  on  criteria  of  clarity  and  read¬ 
ability.  Nevertheless  this  need  not  be  so:  one  can  instead  specify  declaratively,  by 
means  of  a  logic  program,  the  notion  of  relevancy  graph.  As  already  mentioned, 
the  stable  models  of  the  program  obtained  by  the  aforesaid  transformation  co¬ 
incide  with  those  characterized  in  Def.2,  as  expressed  by  the  theorem: 

Theorem 1. Given a MDLP $\mathcal{P} = (\mathcal{P}_D, D)$, the generalized stable models of
$\boxplus_s \mathcal{P}$, restricted to $\mathcal{L}$, coincide with the generalized stable models of $\mathcal{P}$ at state $s$
according to Def.2.

For  lack  of  space,  we  do  not  present  the  proofs  of  the  Theorems.  In  [11],  the 
reader  can  find  an  extended  version  of  this  paper  containing  them. 
6 Properties of MDLP

In this section we study some basic properties of MDLP.

The  next  theorem  states  that  adding  or  removing  edges  from  a  DAG  of  a 
MDLP  preserves  the  semantics  if  the  transitive  closure  of  the  two  DAGs  is  the 
same  DAG.  In  particular,  it  allows  the  use  of  a  transitive  reduction  of  the  original 
graph  to  determine  the  stable  models. 

Theorem 2 (DAG Simplification). Let $\mathcal{P} = (\mathcal{P}_D, D)$ be a MDLP, where
$\mathcal{P}_D = \{P_v : v \in V\}$ and $D = (V, E)$. Let $\mathcal{P}_1 = (\mathcal{P}_D, D_1)$ be a MDLP, where $D_1 =
(V, E_1)$ and $D^+ = D_1^+$. For any state $s \in V$, $M$ is a stable model of $\mathcal{P}$ at state
$s$ iff $M$ is a stable model of $\mathcal{P}_1$ at state $s$.

One consequence of this theorem is that in order to determine the stable
models at a set of states we only need to connect to the new node $\alpha$ the sinks
of the relevancy DAG wrt. that set of states.
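
The premise of Theorem 2, that two DAGs over the same vertex set have the same transitive closure, is easy to test. The sketch below is a minimal illustration with hypothetical function names; it checks only the graph condition and does not compute any stable models.

```python
def transitive_closure(edges):
    """Smallest transitive relation containing the given edge set."""
    closure = set(edges)
    while True:
        extra = {(u, w) for (u, v) in closure for (v2, w) in closure if v == v2}
        if extra <= closure:
            return closure
        closure |= extra

def same_closure(edges, edges_1):
    """True when both edge sets (over the same vertices) share one closure."""
    return transitive_closure(edges) == transitive_closure(edges_1)

full = {(1, 2), (2, 3), (1, 3)}     # contains the redundant edge (1, 3)
reduced = {(1, 2), (2, 3)}          # a transitive reduction of `full`
print(same_closure(full, reduced))
```

Here `full` and `reduced` satisfy the premise, so by the theorem either graph may be used to determine the stable models at any state.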

The  following  proposition  relates  the  stable  models  of  normal  logic  programs 
with  those  of  MDLPs  whose  set  of  programs  just  contains  normal  logic  programs. 

Proposition 2. Let $\mathcal{P} = (\mathcal{P}_D, D)$ be a MDLP, where $\mathcal{P}_D = \{P_v : v \in V\}$ and
$D = (V, E)$. Let $S \subseteq V$ be a set of states and $D_S = (V_S, E_S)$ the relevancy DAG
of $D$ wrt. $S$. If all $P_v : v \in V_S$ are normal logic programs, then $M$ is a stable
model of $\mathcal{P}$ at states $S$ iff $M$ is a stable model of the (normal) logic program
$\bigcup_{v \in V_S} P_v$.

The next theorem shows that MDLP generalizes its predecessor DLP [1].

Theorem 3 (Generalization of DLP). Let $\mathcal{P}_D = \{P_s : s \in S\}$ be a DLP, i.e.
a finite or infinite sequence of generalized logic programs, indexed by the set of natural
numbers $S = \{1, 2, 3, \ldots, n, \ldots\}$. Let $\mathcal{P} = (\mathcal{P}_D, D)$ be the MDLP, where
$D = (S, E)$ is the acyclic digraph such that $E = \{(1, 2), (2, 3), \ldots, (n - 1, n), \ldots\}$.
Then, an interpretation $M$ is a stable model of the dynamic program update
(DLP) at state $s$, $\bigoplus_s \mathcal{P}_D$, if and only if $M$ is a stable model of $\mathcal{P}$ at state $s$.

Since DLP generalizes Interpretation Updates, originally introduced as “Revision
Programs” by Marek and Truszczyński [15], then so does MDLP. In [1],
DLP was defined by means of a transformational semantics only. Theorems 1
and 3 establish Def. 2 as an alternative, declarative, characterization of DLP.


7  Conclusions  and  Future  Work 

We have introduced MDLP as a generalization of DLP in allowing for collections
of states organized by arbitrary acyclic digraphs, and not just sequences of
states, thereby assigning semantics to sets and subsets of logic programs,
on the basis of how they stand in relation amongst themselves, as defined by
one acyclic digraph. Such a natural generalization imparts added expressiveness
to updating, thereby amplifying the coverage of its application domains, as
we’ve tried to illustrate via some examples. The flexibility afforded by a DAG
accrues to the scope and variety of possibilities. The new characteristics of multiplicity
and composition of MDLP may be used to lend a “societal” viewpoint
to Logic Programming. Application areas such as legal reasoning, software development,
organizational decision making, multi-strategy learning, abductive
planning, model-based diagnosis, agent architectures, and others, have already
been successfully pursued by utilizing MDLP.

Multi-dimensional Dynamic Knowledge Representation 377

Other frameworks exist for updates [20,18], and for combining logic programs
via a partial order, developed for purposes other than updating, namely Disjunctive
Ordered Logic (DOL) [4], itself an extension of Ordered Logic, and
DLV$^<$ [3], a language that extends LP with inheritance. Lack of space prevents
us from elaborating on the comparison with these frameworks, so we refer the reader to [5],
where some considerations are made to that effect.

Some of the more immediate themes of ongoing work regarding the further
development of MDLP comprise: allowing for the DAG itself to evolve by updating
it with new nodes and edges; enhancing the LUPS language to adumbrate
update commands over DAGs; studying the conditions for and the uses of dropping
the acyclicity condition; establishing a paraconsistent MDLP semantics
and defining contradiction removal over DAGs.
Abstract. In a previous work we have defined Monotonic Logic Programs
which extend definite logic programming to arbitrary complete lattices
of truth-values with an appropriate notion of implication. We have
shown elsewhere that this framework is general enough to capture Generalized
Annotated Logic Programs, Probabilistic Deductive Databases,
Possibilistic Logic Programming, Hybrid Probabilistic Logic Programs
and Fuzzy Logic Programming [3,4]. However, none of these semantics
define a form of non-monotonic negation, which is fundamental for several
knowledge representation applications. In the spirit of our previous
work, we generalise our framework of Monotonic Logic Programs to allow
for rules with arbitrary antitonic bodies over general complete lattices,
of which normal programs are a special case. We then show that all the
standard logic programming theoretical results carry over to Antitonic
Logic Programs, defining Stable Model and Well-founded Model alike semantics.
We also apply and illustrate our theory to logic programs with
costs, extending the original presentation of [17] with a class of negations.


1  Introduction 

The generalization of standard logic programming to many-valued logics has
been foreseen for quite a long time [20,8,9,14]. Substantial work and results have
been attained in the field. In particular, the logic programming literature is prodigal
in language and semantics proposals for extensions of definite logic programs
(e.g. [7,24,23,5,14]), i.e. those without non-monotonic or default negation. Usually,
the authors characterize their programs with a model theoretic semantics,
where a minimum model is guaranteed to exist, and a corresponding monotonic
fixpoint operator (continuous or not).

In a previous work [4], which we recap here, we abstracted out all the details
and defined a rather general framework of Monotonic Logic Programs that
captures the core of logic programming. In the present paper we generalise the
framework to allow for rules with arbitrary antitonic bodies over general complete
lattices, of which normal programs are a special case, and show that all the
standard logic programming theoretical results carry over to such Antitonic Logic
Programs, defining for them Stable and Well-founded Model semantics alike. We
also apply and illustrate our theory to logic programs with costs, extending the
original presentation of [17] with a class of negations. For this purpose we follow
an algebraic approach to the language and the semantics of logic programs, in
the same spirit as [6].

Our  paper  proceeds  as  follows.  First  we  appeal,  for  motivation  and  examples, 
to  the  logic  programs  with  costs  of  [17],  where  associated  with  each  rule  there 
is  a  weight  or  cost  factor.  In  Section  3  we  recap  Monotonic  Logic  Programs  and 
supply  its  theoretical  results  relevant  here.  In  Section  4,  we  introduce  Antitonic 
Logic  Programs,  by  means  of  a  transformation  into  Monotonic  programs,  and 
define  their  semantics,  showing  they  enjoy  the  properties  consonant  with  a  logic 
programming  approach.  We  end  by  pointing  to  future  work. 

2 Logic Programming with Costs

In [17], rules of definite logic programs are assigned non-negative real numbers,
ascribing to every rule in the program the cost of applying it. The authors
define three interpretations of cost. The non-reusability approach, which we will
discuss in this work, assumes that each individual conclusion about an atom in
the program involves a cost. In the reusability approach, we only pay once for
concluding about an atom, the first time around. Costs can also be interpreted
as time and in this case the semantics proposed is isomorphic to van Emden’s
quantitative logic programs [22]. Let us illustrate the several approaches with an
example:

Example 1. Consider the weighted logic program:

a<^b,c,c,d  b<^c  c4-  c^b 

In the non-reusability approach, we spend a minimum of 3 units to conclude c
and 4 to conclude b. The minimum cost for a is therefore 4 + 4 + 3 + 3 + 6 = 20.

If  we  adopt  the  reusability  interpretation,  we  don’t  need  to  pay  at  all  for  the 
use  of  c  at  the  rule  for  a,  since  the  cost  for  c  has  been  paid  to  conclude  b  with 
cost  4.  Thus,  the  cost  of  atom  a  is  4  +  4  +  6  =  14. 

Now,  for  the  time  interpretation,  we  assume  that  the  atoms  in  the  body  can 
be  proved  in  parallel.  Therefore  the  cost  of  applying  the  rule  is  the  weight  of  the 
rule  plus  the  maximum  time  to  derive  either  atom  in  the  body  and  there  is  no 
reusability.  So,  the  derivation  of  a  takes  4  +  max(4, 3, 3, 6)  =  10. 

In  [17]  the  authors  define  a  model  and  fixpoint  theory  for  logic  programs  with 
costs.  We  will  employ  them  to  illustrate  the  power  of  monotonic  logic  programs, 
and  attain  the  same  results.  In  the  rest  of  the  work  we  shall  focus  on  much  more 
general  logic  programs  with  costs  under  the  non-reusability  approach. 

3  Monotonic  Logic  Programs 

The theoretical foundations of logic programming were clearly established in
[15,21] for definite logic programs (see also [16]), i.e. programs made up of rules
of the form $A_0 \subset A_1 \wedge \ldots \wedge A_n$ ($n \geq 0$) where each $A_i$ ($0 \leq i \leq n$) is a propositional
symbol (an atom), $\subset$ is classical implication, and $\wedge$ the usual Boolean
conjunction.

In this section we generalize the language of definite logic programs in order
to encompass more complex bodies and heads and, evidently, many-valued logics.
For simplicity, we consider only the propositional (ground) case. Furthermore,
we define a model and a fixpoint theory for Monotonic Logic Programs, and
extend to them the classical results of logic programming. The important point
to realize is that all the fundamental results of logic programming depend only
on the monotonicity of the body of the rule and on the fact that it is possible to
determine the truth-value of the proposition in the head from the truth-value of
the rule body. Our underlying set of truth-values will form a complete lattice and
the main idea is that every implication should evaluate to $\top$, the top element in
the lattice.

When  defining  a  (new)  logic  it  is  necessary  to  address  the  two  distinct  but 
related  aspects:  the  syntax  and  the  corresponding  interpretation  of  the  logical 
symbols  in  the  language.  In  this  paper  we  adopt  an  algebraic  characterization 
of  both  the  language  and  interpretation  of  operators.  This  is  a  very  general  and 
powerful  framework,  allowing  for  a  simple  relation  between  the  two.  For  lack  of 
space,  we  reduce  the  presentation  to  the  essentials.  For  more  details  consult  for 
instance  [10]. 

The  main  assumptions  of  the  paper  are  collected  in  the  next  two  definitions. 

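Definition 1 (Implication Algebra). Let $\mathcal{T} = \langle T, \preceq \rangle$ be a complete lattice
and consider an algebra $\mathfrak{A}$ on the carrier set $T$. We say that $\mathfrak{A}$ is an implication
algebra with respect to $\mathcal{T}$ iff it has defined an operator $\leftarrow_{\mathfrak{A}}$ on $\mathfrak{A}$ such that

$$\forall a_1, a_2 \in \mathfrak{A} : (a_1 \leftarrow_{\mathfrak{A}} a_2) = \top \text{ iff } a_2 \preceq a_1 \qquad (1)$$

where $\top$ is the top element of $\mathcal{T}$.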

Example 2. For logic programs with costs we use as set of truth-values the interval
of reals greater than or equal to 0, extended with infinity ($[0, \infty]$), ordered
by the “greater than or equal to” relation. Thus, our bottom element is $\infty$ and the
top element is of course 0, and the least upper bound in this complete lattice
is the infimum. We designate the complete lattice $\langle [\infty, 0], \geq \rangle$ by¹ $\mathcal{R}$. The
implication symbol is defined thus:

$$(r_1 \leftarrow_{\mathfrak{C}} r_2) = \begin{cases} 0 & \text{if } r_2 \geq r_1 \text{, i.e. } r_1 \leq r_2 \\ \infty & \text{otherwise} \end{cases}$$

This defines an implication algebra $\mathfrak{C} = ([\infty, 0], \leftarrow_{\mathfrak{C}})$.
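
As a quick sanity check, the cost implication of Example 2 can be written down and compared against property (1) on a few sample values. This is a minimal sketch; the function and variable names are illustrative, and the lattice order is encoded directly as the numeric "greater than or equal to" relation.

```python
import math

TOP, BOTTOM = 0.0, math.inf   # top and bottom of the cost lattice

def precedes(x, y):
    """Lattice order of Example 2: x is below y iff x >= y as real numbers."""
    return x >= y

def implies(r1, r2):
    """Value of (r1 <- r2): top if r2 >= r1, bottom otherwise."""
    return TOP if r2 >= r1 else BOTTOM

samples = [0.0, 1.5, 3.0, math.inf]
# Property (1): (a1 <- a2) evaluates to top exactly when a2 is below a1.
property_1_holds = all((implies(a1, a2) == TOP) == precedes(a2, a1)
                       for a1 in samples for a2 in samples)
print(property_1_holds)
```

For instance, `implies(3.0, 5.0)` is the top element 0, because a body costing 5 justifies a head costing 3 in this ordering, whereas `implies(5.0, 3.0)` is the bottom element.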

Notice that some many-valued logics have implication connectives which do not
comply with property (1). We refer the reader to [4] for more details.


¹ Note that $\infty$ in $[\infty, 0]$ is $+\infty$ and not $-\infty$.
Our Monotonic Logic Programs will be constructed from the abstract syntax
induced  by  an  implication  algebra  and  a  set  of  propositional  symbols.  The  way 
syntax  and  semantics  relate  in  such  an  algebraic  setting  is  well-known  and  we 
defer  again  to  [10]  for  more  details. 

Definition 2 (Monotonic Logic Programs). Let $\mathfrak{A}$ be an implication algebra
with respect to a complete lattice $\mathcal{T}$. Let $\Pi$ be a set of propositional symbols and
$FORM_{\mathfrak{A}}(\Pi)$ the corresponding algebra of formulae freely generated from $\Pi$ and
the “symbols” of operators in $\mathfrak{A}$. A monotonic logic program is a set of rules² of
the form $A \leftarrow \Psi$ such that:

1. The rule $(A \leftarrow \Psi)$ is a formula of $FORM_{\mathfrak{A}}(\Pi)$;

2. The head of the rule $A$ is a propositional symbol of $\Pi$.

3. The body formula $\Psi$ with propositional symbols $B_1, \ldots, B_n$ ($n \geq 0$) corresponds
to an isotonic function having those symbols as arguments.

As  usual,  we  shall  represent  binary  connectives  in  infix  notation. 

A rule of a monotonic logic program expresses a (monotonic) computation
method of the truth-value of the head propositional symbol from the truth-values
of the symbols in the body. The monotonicity of the rule is guaranteed
by the isotonicity of the function corresponding to formula $\Psi$: if an argument
of $\Psi$ is monotonically increased then the truth-value of $\Psi$ also monotonically
increases. Notice that the bodies of rules can be formed from any operators in
the implication algebra, besides the implication connective.

Example 3. Every rule $A \leftarrow B_1, \ldots, B_m$ with weight $w$ of a weighted logic program is translated
to the Monotonic Logic Program rule $A \leftarrow w + B_1 + \ldots + B_m$ over the implication
algebra $\mathfrak{C}$ extended with the addition operation³ and with constant symbols for
every element of $\mathcal{R}$. The program of Example 1 is translated to:

a  4 -I- 6 -f- c -h  c -h  d  6^5  6<— l*fc  c  6  c^3  d  6 

Notice that if we decrease (under the ordinary ordering of real numbers) the
value of a propositional symbol then the value of the sum also decreases; thus
the bodies of the above rules are isotonic with respect to $\mathcal{R}$.

An interpretation is simply an assignment of a truth-value to each propositional
symbol in the language. We assume in the rest of this section an implication
algebra $\mathfrak{A}$ with respect to a complete lattice $\mathcal{T} = \langle T, \preceq \rangle$. The operator and
implication symbol is denoted by $\leftarrow$. Consider also that a set $\Pi$ of propositional
symbols is given as well as the corresponding algebra $FORM_{\mathfrak{A}}(\Pi)$ of formulae
over $\Pi$. Then the notion of interpretation is straightforward:

² We represent the implication operator $\leftarrow_{\mathfrak{A}}$ at the syntactic level by $\leftarrow$. The same
convention applies to any other operator defined in the implication algebra $\mathfrak{A}$.

³ The sum of $\infty$ with any other element of $\mathcal{R}$ renders $\infty$.
Definition 3 (Interpretation). An interpretation is a mapping $I : \Pi \rightarrow T$.
By the unique homomorphic extension theorem, the interpretation extends
uniquely to a valuation function $\hat{I} : FORM_{\mathfrak{A}}(\Pi) \rightarrow T$. The set of all interpretations
with respect to the implication algebra $\mathfrak{A}$ is denoted by $\mathcal{I}_{\mathfrak{A}}$.

The unique homomorphic extension theorem guarantees that for every interpretation
of propositional symbols there is a unique associated valuation
function. The ordering $\preceq$ on the truth-values in $\mathcal{T}$ is extended to the set of
interpretations as follows:

Definition 4 (Lattice of interpretations). Consider $\mathcal{I}_{\mathfrak{A}}$, the set of all
interpretations with respect to implication algebra $\mathfrak{A}$, and two interpretations
$I_1, I_2 \in \mathcal{I}_{\mathfrak{A}}$. Then, $\langle \mathcal{I}_{\mathfrak{A}}, \sqsubseteq \rangle$ is a complete lattice where
$I_1 \sqsubseteq I_2$ iff $\forall_{p \in \Pi}\ I_1(p) \preceq I_2(p)$.
The least interpretation $\Delta$ maps every propositional symbol to the least element
of $\mathcal{T}$, and the greatest interpretation $\nabla$ maps every propositional symbol to the
top element of the complete lattice of truth-values $\mathcal{T}$.

A rule of a monotonic logic program is satisfied whenever the truth-value
of the rule is $\top$. A model is an interpretation which satisfies every rule in the
program. Formally:

Definition 5 (Satisfaction and Model). Consider an interpretation $I \in \mathcal{I}_{\mathfrak{A}}$.
A monotonic logic program rule $A \leftarrow \mathcal{B}$ is satisfied by $I$ iff $\hat{I}((A \leftarrow \mathcal{B})) = \top$.

An  interpretation  I  is  a  model  of  a  monotonic  logic  program  P  iff  all 
rules  in  P  are  satisfied  by  I. 

We proceed by showing that every monotonic logic program has a least model,
which is the least fixpoint of a monotonic operator, along with other standard
logic programming results. The key tool is the immediate consequences
operator, which extends the results of van Emden and Kowalski [21] to the general
theoretical setting of implication algebras:

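Definition 6 (Immediate consequences operator). Let $P$ be a monotonic
logic program. Define the immediate consequences operator $T_P^{\mathfrak{A}} : \mathcal{I}_{\mathfrak{A}} \to \mathcal{I}_{\mathfrak{A}}$,
mapping interpretations to interpretations, where $A$ is a propositional symbol:

$T_P^{\mathfrak{A}}(I)(A) = lub \{ \hat{I}(\mathcal{B}) \text{ such that } A \leftarrow \mathcal{B} \in P \}$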

The  immediate  consequences  operator  evaluates  the  body  of  every  rule  for  a 
propositional  symbol  A.  The  truth-value  of  A  is  simply  the  least  upper  bound 
of  the  truth- values  of  all  the  bodies  of  the  rules  for  it. 

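Example 4. For weighted logic programs the immediate consequences operator
reduces to:

$T_P^{\mathfrak{A}}(I)(A) = lub_{\geq} \{ s + I(B_1) + \ldots + I(B_m) \mid A \leftarrow s + B_1 + \ldots + B_m \in P \}$

$\qquad = \inf \{ s + I(B_1) + \ldots + I(B_m) \mid A \leftarrow s + B_1 + \ldots + B_m \in P \}$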

This  is  very  similar  to  the  Up  operator  of  [17]. 
A fundamental property of operator $T_P^{\mathfrak{A}}$ is:

Theorem 1 (Monotonicity of the immediate consequences operator).
Let $I_1$ and $I_2$ be two interpretations in $\mathcal{I}_{\mathfrak{A}}$, and $P$ a monotonic logic program.
Operator $T_P^{\mathfrak{A}}$ is monotonic, i.e. if $I_1 \sqsubseteq I_2$ then $T_P^{\mathfrak{A}}(I_1) \sqsubseteq T_P^{\mathfrak{A}}(I_2)$.

As usual, the set of models of $P$ is characterized by the post-fixpoints of $T_P^{\mathfrak{A}}$:

Theorem 2. An interpretation $I$ of $\mathcal{I}_{\mathfrak{A}}$ is a model of a monotonic logic
program $P$ iff $T_P^{\mathfrak{A}}(I) \sqsubseteq I$. Thus, the least fixpoint of $T_P^{\mathfrak{A}}$ is the least model of $P$.

Therefore, the semantics of a monotonic logic program is given by $M_P$, the
least model of $P$. One can obtain the least model of a program by transfinite
iteration of the immediate consequences operator, as stated in the next theorem:

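Theorem 3 (Fixpoint Semantics). Let $P$ be a monotonic logic program, and
consider the transfinite sequence of interpretations of $\mathcal{I}_{\mathfrak{A}}$:

$T_P^{\mathfrak{A}}\uparrow^{0} = \Delta$
$T_P^{\mathfrak{A}}\uparrow^{n+1} = T_P^{\mathfrak{A}}(T_P^{\mathfrak{A}}\uparrow^{n})$, if $n + 1$ is a successor ordinal
$T_P^{\mathfrak{A}}\uparrow^{\alpha} = \bigsqcup_{\beta < \alpha} T_P^{\mathfrak{A}}\uparrow^{\beta}$, if $\alpha$ is a limit ordinal

Then, there is an ordinal $\lambda$ such that $T_P^{\mathfrak{A}}\uparrow^{\lambda+1} = T_P^{\mathfrak{A}}\uparrow^{\lambda}$, and the least model of $P$
is $M_P = T_P^{\mathfrak{A}}\uparrow^{\lambda}$.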

The major difference from classical logic programming is that our
$T_P$ operator might not be continuous, and therefore more than $\omega$ iterations may
be necessary to “reach” the least fixpoint. This possibility is unavoidable if one
wants to retain generality. All the other important results carry over to our
general framework. For the study of sufficient conditions to guarantee the
continuity of $T_P$, see [18]. We now determine the least model of the weighted logic
program of Example 1:

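Example 5. The least interpretation over $\mathcal{R}^+$ maps every literal to $\infty$. Continuing
with Example 4, the computation proceeds as follows:

                     a    b    c    d
$T_P\uparrow^{0}$ =  ∞    ∞    ∞    ∞
$T_P\uparrow^{1}$ =  ∞    5    3    6
$T_P\uparrow^{2}$ =  21   4    3    6
$T_P\uparrow^{3}$ =  20   4    3    6
$T_P\uparrow^{4}$ =  20   4    3    6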
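The iteration in Example 5 can be replayed with a short script. This is only a sketch under stated assumptions: rules are encoded as hypothetical `(head, weight, body)` triples, costs are Python floats with `math.inf` standing for $\infty$, and the fourth rule is transcribed as `c <- 6` (reading it as `c <- b` would yield the same least model, since `c <- 3` is cheaper either way). Because the lattice order is the reverse of the numeric order, the least upper bound is computed as a numeric minimum.

```python
import math

INF = math.inf

# Each rule (head, weight, body) stands for  head <- weight + b1 + ... + bm.
# The encoding and the names PROGRAM / ATOMS are illustrative assumptions.
PROGRAM = [
    ("a", 4, ["b", "c", "c", "d"]),
    ("b", 5, []),
    ("b", 1, ["c"]),
    ("c", 6, []),
    ("c", 3, []),
    ("d", 6, []),
]
ATOMS = ["a", "b", "c", "d"]


def tp(program, interp):
    """One application of the immediate consequences operator.

    The lub of a set of costs is its numeric minimum (infinity if empty).
    """
    result = {atom: INF for atom in interp}
    for head, weight, body in program:
        cost = weight + sum(interp[b] for b in body)
        result[head] = min(result[head], cost)
    return result


def least_model(program, atoms, max_steps=1000):
    """Iterate tp from the least interpretation until it stops changing."""
    interp = {atom: INF for atom in atoms}
    trace = [interp]
    for _ in range(max_steps):
        nxt = tp(program, interp)
        if nxt == interp:
            return interp, trace
        trace.append(nxt)
        interp = nxt
    # Finitely many steps need not suffice in general (non-continuous T_P).
    raise RuntimeError("no fixpoint reached within max_steps")


model, trace = least_model(PROGRAM, ATOMS)
for step, interp in enumerate(trace):
    print(step, [interp[x] for x in ATOMS])
```

The bounded loop only covers finite iteration; the transfinite case of Theorem 3 is outside the scope of this sketch.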

4  Antitonic  Logic  Programs 

In the preceding section we reviewed the framework of Monotonic Logic
Programs. Now we extend the syntax of programs, allowing for rules with antitonic
bodies, using the techniques which have been developed in logic programming
theory. Our aim is to define well-founded [11] and stable model
semantics [12] for logic programming over arbitrary implication algebras. Thus, we can
easily extend all the semantics for which we have an embedding into Monotonic
Logic Programs with non-monotonic “negations.” We apply these new results to
logic programs with costs, answering some of the questions raised in [17].

Let  us  start  with  the  language  of  Antitonic  Logic  Programs: 

Definition 7 (Antitonic Logic Programs). Let $\mathfrak{A}$ be an implication algebra
with respect to a complete lattice $\mathcal{T}$. Let $\Pi$ be a set of propositional symbols
and $FORM_{\mathfrak{A}}(\Pi)$ the corresponding algebra of formulae freely generated from
$\Pi$ and the “symbols” of operators in $\mathfrak{A}$. An antitonic logic program is a pair
$\langle P^+, P^- \rangle$ where $P^+$ and $P^-$ are sets of rules of the form $A \leftarrow \mathcal{B}$ such that:

1. The rule $(A \leftarrow \mathcal{B})$ is a formula of $FORM_{\mathfrak{A}}(\Pi)$;

2. The head of the rule $A$ is a propositional symbol of $\Pi$.

3. If $A \leftarrow \mathcal{B}$ belongs to $P^+$, then the body formula $\mathcal{B}$ with propositional
symbols $p_1, \ldots, p_n$ ($n \geq 0$) corresponds to an isotonic function having those
symbols as arguments.

4. If $A \leftarrow \mathcal{B}$ belongs to $P^-$, then the body formula $\mathcal{B}$ with propositional
symbols $p_1, \ldots, p_n$ ($n \geq 0$) corresponds to an antitonic function having those
symbols as arguments.

We  denote  a  rule  o/P+  by  A  ^  and  a  rule  of  P~  by  A  < — 

We  recall  that  a  function  is  isotonic  (antitonic)  iff  the  value  of  the  function 
increases  (decreases)  when  we  increase  an  argument  while  the  other  remaining 
arguments  are  kept  fixed.  Note  that  we  do  not  introduce  explicitly  a  negation 
symbol  in  the  language.  A  negation  can  be  obtained  by  associating  an  antitonic 
function  with  an  ordinary  propositional  symbol,  as  shown  in  the  next  Example. 

Example 6. For weighted logic programs we can define a natural class of
negations by considering the cost function $\vartheta - x$, onto $\mathcal{R}^+$, which returns 0 if $x \geq \vartheta$,
otherwise its value is $\vartheta - x$. The special case of $\infty - x$ returns 0 whenever $x = \infty$,
else it evaluates to $\infty$. The interpretation of this last case is immediate: if an
atom cannot be concluded from the program its cost is $\infty$ and therefore its
negation has cost 0. Otherwise, the negation does not hold and therefore the cost of
$\infty - x$ is $\infty$. An example of an antitonic logic program is:

$a \leftarrow 3 + b + not\_d. \qquad b \leftarrow 2. \qquad d \leftarrow 4 + e. \qquad e \leftarrow 1 + not\_a.$

$not\_a \leftarrow 7 - a \qquad not\_d \leftarrow 5 - d$

The symbols ‘not_a’ and ‘not_d’ are new propositional symbols.
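The cost function of Example 6 is small enough to write down directly. The sketch below assumes costs are Python floats with `math.inf` for $\infty$; the name `neg` is a hypothetical helper, not notation from the paper.

```python
import math

INF = math.inf


def neg(theta, x):
    """Antitonic cost 'theta - x' in the sense of Example 6.

    Returns 0 when x >= theta and theta - x otherwise. With theta = INF
    this gives 0 exactly when x is INF, and INF for every finite x.
    """
    if x >= theta:
        return 0
    return theta - x


# The function never increases when its argument grows (antitonic).
samples = [0, 1, 2.5, 5, 7, 100, INF]
values = [neg(7, x) for x in samples]
print(values)
```

Checking `x >= theta` before subtracting also avoids evaluating `INF - INF`, which would be `nan` in floating-point arithmetic.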

We proceed by first defining a well-founded like semantics for antitonic logic
programs, guaranteeing the existence of a model for every program. Afterwards,
we deal with stable models, which are simpler to introduce. Again, we assume an
implication algebra $\mathfrak{A}$ with respect to a complete lattice $\mathcal{T} = \langle T, \preceq \rangle$. Consider
too that a set $\Pi$ of propositional symbols is given as well as the corresponding
algebra $FORM_{\mathfrak{A}}(\Pi)$ of formulae over $\Pi$. To start with, a new notion of
interpretation is required:
Definition 8 (Partial Interpretations). A partial interpretation is a pair of
interpretations $\langle I^t, I^{tu} \rangle$. The set of all partial interpretations is $\mathcal{I}^p_{\mathfrak{A}}$.

The $I^t$ component contains what is “true” in the interpretation, while $I^{tu}$
contains what is “non-false”, i.e. true or undefined. It is important to remark that we do
not impose consistency of the interpretation (i.e. $I^t \sqsubseteq I^{tu}$) and thus allow for
paraconsistency. Two orders among partial interpretations are useful:

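Definition 9 (Truth and knowledge ordering). Let $I_1$ and $I_2$ be two partial
interpretations. The truth and knowledge orders among partial interpretations
are defined by:

Truth ordering: $I_1 \sqsubseteq_t I_2$ iff $I_1^t \sqsubseteq I_2^t$ and $I_1^{tu} \sqsubseteq I_2^{tu}$.

Knowledge ordering: $I_1 \sqsubseteq_k I_2$ iff $I_1^t \sqsubseteq I_2^t$ and $I_2^{tu} \sqsubseteq I_1^{tu}$.

The set of partial interpretations ordered by $\sqsubseteq_t$ or $\sqsubseteq_k$ is a complete lattice.
Clearly, the bottom and top elements of these lattices are $\bot_t = \langle \Delta, \Delta \rangle$,
$\top_t = \langle \nabla, \nabla \rangle$, $\bot_k = \langle \Delta, \nabla \rangle$ and $\top_k = \langle \nabla, \Delta \rangle$.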

Given  a  partial  interpretation,  we  can  define  the  model  of  an  antitonic  logic 
program: 

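Definition 10. Let $P = \langle P^+, P^- \rangle$ be an antitonic logic program. A partial
interpretation $I = \langle I^t, I^{tu} \rangle$ satisfies a rule $A \leftarrow \mathcal{B}$ of $P$ iff:

- $\hat{I}^t(A \leftarrow \mathcal{B}) = \top$, if the rule belongs to $P^+$.

- $(I^t(A) \leftarrow \hat{I}^{tu}(\mathcal{B})) = \top$, if the rule belongs to $P^-$.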

We  say  that  I  is  a  model  of  P  iff  the  interpretation  satisfies  all  the  rules  of  P. 

Notice that the heads of rules are always evaluated with respect to $I^t$ while
the truth-values of bodies are determined from $I^t$ ($I^{tu}$) for isotonic (antitonic)
rules. The attentive reader will have noticed that with the above definition our
programs define only positive information from the bodies. Because of the
antitonic character of rules in $P^-$, the greater the body arguments the lesser is
$\hat{I}^{tu}(\mathcal{B})$ and the lesser is $I^t(A)$. Clearly, our framework captures normal logic
programs: every rule of the form $A \leftarrow B_1, \ldots, B_m, not\ C_1, \ldots, not\ C_n$ is
translated into $A \leftarrow B_1 \wedge \ldots \wedge B_m \wedge not\_C_1 \wedge \ldots \wedge not\_C_n$ and a further rule exists
for each new atom $not\_C_i$ of the form $not\_C_i \leftarrow 1 - C_i$. The underlying lattice
of truth-values is $\{0, 1\}$ with $0 < 1$.

A natural generalization is to include rules which declare directly the
truth-values of $I^{tu}$. Here we enter the arena of extended logic programming and
general logic programming, where rules with explicit and default negation in
their heads are permitted, which we will deal with in subsequent work.

In  order  to  define  the  semantics  of  Antitonic  Logic  Programs  we  will  resort 
to  a  Gelfond-Lifschitz  like  division  operator  [12],  which  transforms  an  Antitonic 
Logic  Program  into  a  Monotonic  one,  for  which  we  know  how  to  compute  the 
corresponding  least  model.  This  is  the  usual  technique  for  defining  well-founded 
and  stable  model  semantics  for  normal  logic  programs.  Our  definition  and  results 
orbit  around  this  notion. 
Definition 11 (Program Division). Consider an antitonic logic program $P$
and an interpretation^ $I$. The division of program $P$ by $I$ is the monotonic
logic program:

$\frac{P}{I} = P^+ \cup \{ A \leftarrow \vartheta \text{ such that } A \leftarrow \mathcal{B} \in P^- \text{ and } \vartheta = \hat{I}(\mathcal{B}) \}$

We assume that every truth-value in $\mathcal{T}$ has a corresponding constant operator in
$\mathfrak{A}$ which returns its value.

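Example 7. Consider the program of Example 6 and interpretation $I$ mapping $a$,
$not\_a$, $b$, $d$, $not\_d$ and $e$ to $\infty$, 0, 2, 3, 0 and 1, respectively. The division of that
program by $I$ is:

$a \leftarrow 3 + b + not\_d \qquad b \leftarrow 2 \qquad d \leftarrow 4 + e \qquad e \leftarrow 1 + not\_a$

$not\_a \leftarrow 0 \qquad not\_d \leftarrow 2$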

Thus,  the  program  division  operation  substitutes  the  bodies  of  antitonic  rules 
by  their  truth- value  in  the  given  interpretation.  The  result  is  a  monotonic  logic 
program.  Now  we  can  define  an  auxiliary  operator  mapping  partial  interpreta¬ 
tions  into  interpretations,  which  will  be  extensively  used: 

Definition 12. Consider an antitonic logic program $P$ and a partial
interpretation $I = \langle I^t, I^{tu} \rangle$. Then, $C_P^{\mathfrak{A}}(I) = T^{\mathfrak{A}}_{P / I^{tu}}(I^t)$.
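Definitions 11 and 12 can be sketched for weighted programs as follows. The encoding is an assumption made for illustration: isotonic rules are `(head, weight, body)` triples, antitonic rules are `(head, theta, atom)` triples read as `head <- theta - atom`, and the constants 7 and 5 for `not_a` and `not_d` are the values consistent with the numbers reported in Examples 7 and 8. The helper names `neg`, `divide`, `tp` and `c_p` are hypothetical.

```python
import math

INF = math.inf
ATOMS = ["a", "not_a", "b", "d", "not_d", "e"]

# Isotonic rules: head <- weight + sum of body atoms.
P_PLUS = [
    ("a", 3, ["b", "not_d"]),
    ("b", 2, []),
    ("d", 4, ["e"]),
    ("e", 1, ["not_a"]),
]
# Antitonic rules: head <- theta - atom.
P_MINUS = [("not_a", 7, "a"), ("not_d", 5, "d")]


def neg(theta, x):
    """Cost function 'theta - x': 0 if x >= theta, else theta - x."""
    return 0 if x >= theta else theta - x


def divide(p_plus, p_minus, interp):
    """Program division: replace each antitonic body by its value in interp."""
    facts = [(head, neg(theta, interp[atom]), []) for head, theta, atom in p_minus]
    return p_plus + facts


def tp(rules, interp):
    """Immediate consequences for a monotonic weighted program (lub = min)."""
    result = {atom: INF for atom in interp}
    for head, weight, body in rules:
        result[head] = min(result[head], weight + sum(interp[b] for b in body))
    return result


def c_p(i_t, i_tu):
    """One step from i_t in the program divided by i_tu (Definition 12)."""
    return tp(divide(P_PLUS, P_MINUS, i_tu), i_t)


# Interpretation of Example 7.
I = {"a": INF, "not_a": 0, "b": 2, "d": 3, "not_d": 0, "e": 1}
divided = divide(P_PLUS, P_MINUS, I)
print(divided[-2:])  # the two antitonic rules, now constant facts

bottom = {atom: INF for atom in ATOMS}
step = c_p(bottom, I)
print(step)
```

Keeping division separate from the one-step operator mirrors the text: division yields an ordinary monotonic program, so any procedure for monotonic programs can be reused unchanged.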

The $C_P(I)$ operator determines what follows immediately from the program
given the partial interpretation. We have the following important results:

Proposition 1. A partial interpretation $I$ is a model of an antitonic logic
program $P$ iff $C_P^{\mathfrak{A}}(I) \sqsubseteq I^t$. Furthermore, let $J$ be another partial interpretation such
that $I \sqsubseteq_k J$; then $C_P^{\mathfrak{A}}(I) \sqsubseteq C_P^{\mathfrak{A}}(J)$.

The last proposition simply assures that if we have more knowledge (fewer
undefined values in the interpretation) then we can extract more information
from our programs.

Operator $C_P$ maps partial interpretations to interpretations. We now define a
new operator which maps partial interpretations to partial interpretations, given
what is known to be true and to be false, in the same spirit as Przymusinski’s $\Theta$
operator [19]:

Definition 13 (Partial Consequences Operator). Let $P$ be an antitonic
logic program, and $I$ and $J$ two partial interpretations. The partial
consequences operator is given by the equation:

$\Theta_P^{\mathfrak{A}}(I, J) = \left( C_P^{\mathfrak{A}}(\langle I^t, J^{tu} \rangle),\ C_P^{\mathfrak{A}}(\langle I^{tu}, J^t \rangle) \right)$

We usually omit the subscript $\mathfrak{A}$ and denote $\Theta_P(I, J)$ by $\Theta_P^J(I)$.


^  Mark  well  this  is  an  interpretation,  not  a  partial  interpretation! 
Interpretation $J$ represents safe knowledge. On the one hand, from $J^{tu}$ one
can conclude what definitely doesn’t hold, and thus we use this interpretation
for deriving what holds surely. On the other hand, from $J^t$ one knows what is
true, and so we use it for deriving what may possibly hold.

Proposition 2. Let $I$, $J$ and $K$ be partial interpretations. If $I \sqsubseteq_t J$ then
$\Theta_P^K(I) \sqsubseteq_t \Theta_P^K(J)$.

Basically we have shown in the previous proposition that $\Theta_P$ is a monotonic
operator on the lattice of partial interpretations according to the truth-ordering.
Its least fixpoint is guaranteed to exist and is important for our objectives:

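Definition 14 (Full Consequences Operator). Let $P$ be an antitonic logic
program and $J$ a partial interpretation. Define $\Omega_P(J) = lfp\ \Theta_P^J$. Alternatively,
$\Omega_P(J)$ is given by $\Theta_P^J\uparrow^{\lambda}$ where $\lambda$ is the least ordinal for which
$\Theta_P^J\uparrow^{\lambda+1} = \Theta_P^J\uparrow^{\lambda}$, with

$\Theta_P^J\uparrow^{0} = \langle \Delta, \Delta \rangle$
$\Theta_P^J\uparrow^{n+1} = \Theta_P^J(\Theta_P^J\uparrow^{n})$, if $n + 1$ is a successor ordinal
$\Theta_P^J\uparrow^{\alpha} = \bigsqcup_t \{ \Theta_P^J\uparrow^{\beta} \text{ such that } \beta < \alpha \}$, if $\alpha$ is a limit ordinal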

By an easy transfinite induction proof one concludes:

Proposition 3. Let $P$ be an antitonic logic program and $J$ an arbitrary partial
interpretation; then:

$\Theta_P^J\uparrow^{\alpha} = \left( T_{P / J^{tu}}\uparrow^{\alpha},\ T_{P / J^{t}}\uparrow^{\alpha} \right)$

So, the least fixpoint of $\Theta_P^J$ is in fact obtained from the least models of the
monotonic logic programs $\frac{P}{J^{tu}}$ and $\frac{P}{J^{t}}$. This deserves a definition and a theorem:

Definition 15 (Gamma operator). Let $P$ be an antitonic logic program and $J$
an interpretation. Define

$\Gamma_P(J) = M_{P/J} = lfp\ T_{P/J} = T_{P/J}\uparrow^{\lambda}$, for some ordinal $\lambda$

Theorem 4. Let $P$ be an antitonic logic program and $J$ a partial interpretation.
Then,

$\Omega_P(J) = \left( \Gamma_P(J^{tu}),\ \Gamma_P(J^{t}) \right)$

Thus, we have defined the full consequences operator in terms of the least
model of two monotonic logic programs. Basically, Theorem 4 states that the
least model of $\frac{P}{J^{tu}}$ gives what is true given the safe knowledge $J$, and that the
least model of $\frac{P}{J^{t}}$ provides what is non-false.

A well-known result of logic programming theory is that operator $\Gamma$ is
anti-monotonic. The same happens with antitonic logic programs:

Proposition 4. Consider the antitonic logic program $P$. Let $I$ and $J$ be two
interpretations such that $I \sqsubseteq J$; then $\Gamma_P(J) \sqsubseteq \Gamma_P(I)$.
From this there follows a fundamental result:

Theorem 5 (Monotonicity of the full consequences operator). Let $P$
be an antitonic logic program, and $I \sqsubseteq_k J$ partial interpretations. Then,
$\Omega_P(I) \sqsubseteq_k \Omega_P(J)$.

We conclude immediately, again by the Knaster-Tarski fixpoint theorem, that
$\Omega_P$ has a least fixpoint under the knowledge ordering of partial interpretations.

Definition 16 (Well-founded Semantics). Consider an antitonic logic
program $P$. The partial stable models of $P$ are the fixpoints of operator $\Omega_P$. The least
one under the knowledge ordering of partial interpretations is the well-founded
model $WFM_P$, and can be obtained by transfinitely iterating the $\Omega_P$ operator
from the least partial interpretation $\bot_k = \langle \Delta, \nabla \rangle$.

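Given a partial stable model $M$ we say that an atom $A$ is:

- true with degree $\vartheta$ wrt. $M$ if $\vartheta \preceq M^t(A)$ and $\vartheta \preceq M^{tu}(A)$.

- undefined with degree $\vartheta$ wrt. $M$ if $\vartheta \not\preceq M^t(A)$ and $\vartheta \preceq M^{tu}(A)$.

- false with degree $\vartheta$ wrt. $M$ if $\vartheta \not\preceq M^t(A)$ and $\vartheta \not\preceq M^{tu}(A)$.

- inconsistent with degree $\vartheta$ wrt. $M$ if $\vartheta \preceq M^t(A)$ and $\vartheta \not\preceq M^{tu}(A)$.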
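For the cost lattice the four cases can be checked mechanically. The sketch below assumes the reversed numeric order used for weighted programs (a cost is below another in the lattice when it is numerically greater or equal); the function names `leq` and `status` and the sample values are hypothetical.

```python
import math

INF = math.inf


def leq(x, y):
    """Lattice order on costs: x is below y when x is numerically >= y."""
    return x >= y


def status(theta, m_t, m_tu):
    """Classify an atom with degree theta, given its values in M^t and M^tu."""
    in_t = leq(theta, m_t)
    in_tu = leq(theta, m_tu)
    if in_t and in_tu:
        return "true"
    if in_tu:
        return "undefined"
    if in_t:
        return "inconsistent"
    return "false"


# Illustrative values only: an atom with cost INF in M^t and cost 5 in M^tu.
print(status(5, INF, 5))
print(status(2, 2, 2))
print(status(1, INF, INF))
```

Exactly one of the four labels applies for any given degree, because the two membership tests are independent booleans.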

By resorting to the definition of $\Omega_P$ in terms of $\Gamma_P$ we can present an
alternating fixpoint definition [1] of the well-founded semantics for antitonic logic
programs:

Theorem 6. Consider an antitonic logic program $P$. The partial
interpretation $I$ is a fixpoint of $\Omega_P$ iff $\Gamma_P(\Gamma_P(I^t)) = I^t$ and $I^{tu} = \Gamma_P(I^t)$.

Example 8. The well-founded model in Example 6 is determined as follows:

                                    a   not_a   b   d   not_d   e
$I_0$                            =  ∞     ∞     ∞   ∞     ∞     ∞
$\Gamma_P(I_0)$                  =  5     0     2   5     0     1
$I_1 = \Gamma_P(\Gamma_P(I_0))$  =  5     2     2   7     0     3
$\Gamma_P(I_1)$                  =  5     2     2   7     0     3

The well-founded model of $P$ is $\langle I_1, \Gamma_P(I_1) \rangle = \langle I_1, I_1 \rangle$, and thus we
conclude $a$ with cost 5. The example illustrates a possible interpretation of negation
in logic programs with costs: what costs more to prove true costs less to show
false, and vice-versa.
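The alternating computation of Example 8 can be reproduced with a few lines of code. As before, this is a sketch under assumptions: rules are hypothetical triples, the antitonic constants 7 and 5 are the values consistent with the table above, and the iteration is bounded, so it only covers the finite case.

```python
import math

INF = math.inf
ATOMS = ["a", "not_a", "b", "d", "not_d", "e"]

# Isotonic rules (head, weight, body): head <- weight + sum of body atoms.
P_PLUS = [
    ("a", 3, ["b", "not_d"]),
    ("b", 2, []),
    ("d", 4, ["e"]),
    ("e", 1, ["not_a"]),
]
# Antitonic rules (head, theta, atom): head <- theta - atom.
P_MINUS = [("not_a", 7, "a"), ("not_d", 5, "d")]


def neg(theta, x):
    """Cost function 'theta - x': 0 if x >= theta, else theta - x."""
    return 0 if x >= theta else theta - x


def least_model(rules, max_steps=1000):
    """Least model of a monotonic weighted program by fixpoint iteration."""
    interp = {atom: INF for atom in ATOMS}
    for _ in range(max_steps):
        nxt = {atom: INF for atom in ATOMS}
        for head, weight, body in rules:
            nxt[head] = min(nxt[head], weight + sum(interp[b] for b in body))
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


def gamma(interp):
    """Gamma operator: least model of the program divided by interp."""
    facts = [(head, neg(theta, interp[atom]), []) for head, theta, atom in P_MINUS]
    return least_model(P_PLUS + facts)


def alternating_fixpoint(max_steps=1000):
    """Iterate gamma twice per step, starting from the all-infinity interpretation."""
    current = {atom: INF for atom in ATOMS}
    for _ in range(max_steps):
        nxt = gamma(gamma(current))
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


i0 = {atom: INF for atom in ATOMS}
print([gamma(i0)[x] for x in ATOMS])
wfm_t = alternating_fixpoint()
wfm_tu = gamma(wfm_t)
print([wfm_t[x] for x in ATOMS], [wfm_tu[x] for x in ATOMS])
```

Applying `gamma` twice per step keeps the iterated sequence monotone, since a single application of an anti-monotonic operator reverses the order.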

Thus, the well-founded model of $P$ is given by the least fixpoint of the $\Gamma_P^2$
operator, a well-known result in the theory of logic programming [1]. We have
the next reassuring result:

Theorem  7.  The  partial  stable  models  of  an  antitonic  logic  program  P  are  mod¬ 
els  of  P. 
A partial stable model is said to be consistent whenever there is no propositional
symbol $A$ which is inconsistent to any degree $\vartheta$. It is said to be fully defined whenever
there is no propositional symbol with undefined truth-value to any degree $\vartheta$.
Clearly, the following is expected:

Proposition 5. A partial stable model $M$ of antitonic logic program $P$ is
consistent iff $M^t \sqsubseteq M^{tu}$. It is fully defined whenever $M^{tu} \sqsubseteq M^t$.

The  consistent  and  fully-defined  partial  stable  models  of  P  are  the  stable 
models  of  P: 

Definition 17 (Stable Models). Let $P$ be an antitonic logic program. We say
that $M$ is a stable model of $P$ iff it is a consistent and fully defined partial stable
model of $P$. Furthermore, $M^t = M^{tu}$ and $M$ is a stable model iff it is a fixpoint
of $\Gamma_P$, i.e. $M = \Gamma_P(M)$.

The  program  of  Example  6  has  a  single  stable  model  which  coincides  with 
its  well-founded  model. 

Example 9. The program containing the rules $a \leftarrow \vartheta_1 - not\_a$. and $not\_a \leftarrow
\vartheta_2 - a$. has a single stable model when $\vartheta_1 = 0$ or $\vartheta_2 = 0$. Suppose now that
both are finite non-zero real numbers. In this case if $\vartheta_1 \neq \vartheta_2$ then the program
has no stable models. Assume that $\vartheta_1 = \vartheta_2 > 0$. All interpretations such that
$a + not\_a = \vartheta_1$ are stable models of the program, therefore there exists an infinite
number of them. Finally, if both $\vartheta_1 = \vartheta_2 = \infty$ then there are two stable models
of the program. If $\vartheta_1 = \infty$ and $\vartheta_2 \neq \infty$ we have a single stable model.

The analysis of the program comprising the single rule $a \leftarrow \vartheta - a$ is easier.
If $\vartheta = \infty$ then there are no stable models. Otherwise, the single stable model
assigns the truth-value $\frac{\vartheta}{2}$ to $a$.

Example 10. Consider this variant of Example 6:

$a \leftarrow 3 + b + not\_d. \qquad b \leftarrow 2. \qquad d \leftarrow 4 + e. \qquad e \leftarrow 1 + not\_a.$

$not\_a \leftarrow \infty - a \qquad not\_d \leftarrow \infty - d$

The well-founded model and its two stable models are shown below:

               a   not_a   b   d   not_d   e
$WFM^{t}$   =  ∞     ∞     2   ∞     ∞     ∞
$WFM^{tu}$  =  5     0     2   5     0     1
$M_1$       =  ∞     0     2   5     ∞     1
$M_2$       =  5     ∞     2   ∞     0     ∞

The positive part of the well-founded model is not very informative, except that
we can conclude $b$ with cost 2. However, the non-false part provides the joint
limits of the stable models. In general, $WFM^{t} \sqsubseteq \sqcap_i M_i$ and $\sqcup_i M_i \sqsubseteq WFM^{tu}$.
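The table of Example 10 can be cross-checked by testing the fixpoint condition $M = \Gamma_P(M)$ and by iterating the full consequences operator from $\langle \Delta, \nabla \rangle$. This is again a sketch with a hypothetical rule encoding; in the cost lattice $\Delta$ maps every atom to $\infty$ and $\nabla$ maps every atom to 0.

```python
import math

INF = math.inf
ATOMS = ["a", "not_a", "b", "d", "not_d", "e"]

P_PLUS = [
    ("a", 3, ["b", "not_d"]),
    ("b", 2, []),
    ("d", 4, ["e"]),
    ("e", 1, ["not_a"]),
]
# Antitonic rules of the variant: head <- INF - atom.
P_MINUS = [("not_a", INF, "a"), ("not_d", INF, "d")]


def neg(theta, x):
    """Cost function 'theta - x': 0 if x >= theta, else theta - x."""
    return 0 if x >= theta else theta - x


def least_model(rules, max_steps=1000):
    """Least model of a monotonic weighted program by fixpoint iteration."""
    interp = {atom: INF for atom in ATOMS}
    for _ in range(max_steps):
        nxt = {atom: INF for atom in ATOMS}
        for head, weight, body in rules:
            nxt[head] = min(nxt[head], weight + sum(interp[b] for b in body))
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


def gamma(interp):
    """Gamma operator: least model of the program divided by interp."""
    facts = [(head, neg(theta, interp[atom]), []) for head, theta, atom in P_MINUS]
    return least_model(P_PLUS + facts)


def well_founded_model(max_steps=1000):
    """Iterate (t, tu) -> (gamma(tu), gamma(t)) from the pair (Delta, Nabla)."""
    t = {atom: INF for atom in ATOMS}
    tu = {atom: 0 for atom in ATOMS}
    for _ in range(max_steps):
        new_t, new_tu = gamma(tu), gamma(t)
        if (new_t, new_tu) == (t, tu):
            return t, tu
        t, tu = new_t, new_tu
    raise RuntimeError("no fixpoint reached within max_steps")


m1 = dict(zip(ATOMS, [INF, 0, 2, 5, INF, 1]))
m2 = dict(zip(ATOMS, [5, INF, 2, INF, 0, INF]))
print(gamma(m1) == m1, gamma(m2) == m2)

wfm_t, wfm_tu = well_founded_model()
print([wfm_t[x] for x in ATOMS], [wfm_tu[x] for x in ATOMS])
```

The script only verifies that the two listed interpretations are fixpoints; it does not search for further stable models.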

The interpretation of the negation $\infty - A$ in stable model semantics should
now be clear. Given an atom that we can prove then its negation is not provable
and it therefore costs $\infty$ to prove. Also, by simply looking at the model, one
can  see  what  does  not  hold,  and  therefore  know  instantly  that  its  negation  holds 
without  any  extra  effort,  justifying  its  zero  cost.  We  believe  this  is  in  accordance 
with  the  ideas  advanced  in  the  conclusions  section  of  [17],  as  a  possible  interpre¬ 
tation  for  negation  in  logic  programs  with  costs.  Obviously,  any  other  antitonic 
negation  can  be  easily  encoded  in  our  framework. 
5  Conclusions  and  Future  Work

In  a  single  sentence,  this  paper  shows  how  to  (logically)  program  with  arbitrary 
monotonic  and  anti-monotonic  (antitonic)  operators  over  a  complete  lattice.  This 
is  a  simple  and  powerful  idea  with  an  enormous  range  of  applications.  In  par¬ 
ticular,  the  present  work  paves  the  way  to  combine  and  integrate  several  forms 
of  reasoning  into  a  single  framework,  namely  fuzzy,  probabilistic,  uncertain,  and 
paraconsistent  ones. 

This  paper  is  the  natural  extension  of  the  seminal  works  by  Subrahma- 
nian  [20,14]  and  Fitting  [8,9].  However,  these  authors  stick  to  a  logic  program¬ 
ming  syntax  instead  of  considering  arbitrary  monotonic  and  antitonic  functions 
in  the  bodies.  This  is  a  major  contribution  of  our  work.  To  be  absolutely  fair,  the 
article  [9]  introduces  the  notions  of  attenuation  operators  which  can  be  viewed 
as  arbitrary  monotonic  functions  over  bilattices. 

Quite recently, there appeared in the literature a profound work [6] entitled
“Approximations, Stable Operators, Well-founded Fixpoints and Applications in
Nonmonotonic Reasoning”. The stance there is to start from either a monotone
or antitonic operator $O$ and define approximations to it, via a stable operator which
corresponds to an abstract version of the well-founded semantics. The interesting
case is when $O$ is antitonic. In this situation, most of the results in [6] are obtained
in our scheme by considering the single rule program $A \leftarrow O(A)$. It is also true
that our results follow immediately from [6]. So, both frameworks are capable
of expressing each other. However, we show how multiple operators should be
combined in order to extend the widely accepted well-founded and stable model
semantics for normal logic programs, a point not addressed in [6].

Regarding future work, we expect to show the embeddings of other cost
functions introduced in [17], as well as those of [22]. Extension of our antitonic
results to residuated lattices (cf. [2,13]), where a generalized modus ponens rule
is defined, is also foreseen. We also aim to provide a proof theory for (instances
of) our framework.
1  General  Information

The A-system [4] is a new system for performing abductive reasoning within
the framework of Abductive Logic Programming (ALP). The principles behind
the system are founded on work on two earlier systems, ACLP [2,3] and
SLDNFA(C) [1,6]. The basic inference mechanism of the system combines abductive
logic programming and constraint logic programming. In its computation it
reduces the high level specification of the problem and goal at hand to a lower
level constraint store. This constraint store is managed by an efficient constraint
solver, which returns information to the abductive reduction process in order to help
it in its search for a solution.

In the development of the A-system particular attention was paid to general
purpose control and search strategies in order to enhance the computational
behaviour of the system. The main idea is to suspend each commitment to
a choice up to the moment where no other information can be derived in a
deterministic way. This is an improvement over its ancestors.

The A-system is implemented as a meta interpreter on top of SICStus Prolog,
where at least version 3.8.5 is needed. The system is therefore available on each
platform on which SICStus is available.

2  Description  of  the  System 

The A-system is a declarative problem solving environment. It allows us to
specify the problem domain in a well structured way in terms of an abductive logic
program (ALP). A problem specification consists of two parts of information:
definitional knowledge and assertional knowledge or integrity constraints. In the
definitional part predicates are defined in a (constraint) logic program by
specifying the rules embodying all the information known of these predicates.

In a planning domain example the predicate holds_at(P,T) is defined as:

holds_at(P,T) :- initially(P), not(init_clipped(P,T)).
holds_at(P,T) :- initiates(P,A), act(A,E), E<T, not(clipped(E,P,T)).

The  predicates  which  are  not  defined  like  act(A,E)  are  called  open  or  abducible. 

Integrity constraints are universally quantified statements which express a
property of the theory (application domain) that must remain true whenever
this theory is extended with a (partial) definition of the open predicates. In
practice, this means that the integrity constraints encode at a declarative level
properties of the abductive solutions to the goals of our problem domain. In the
A-system they have to be specified as denials with head ic. For example, the
two constraints below ensure that the second argument of act(A,T) is within
the valid time interval, and that the preconditions of each action are satisfied.

ic :- act(A,T), not(time(T)).

ic :- act(A,T), not(pre_conditions_hold(A,T)).

Given a query Q the A-system computes an extension of the open predicates
such that this set entails Q and the integrity constraints. Semantically this
solution is formalised as an abductive explanation of Q. Computationally this is
found through a reduction process where the system goes through a cycle of
selecting a choice point, making this choice and propagating this choice
deterministically as much as possible. In general, the A-system postpones choices as long
as possible and tries to evaluate the choice points in an informed way. This is
mainly done by interacting with the constraint solver.

3  Applying  the  System 

A simple preprocessor accepts the ALP problem representation and compiles it
into a readable format for the system. The preprocessor will automatically load
the compiled specification. The A-system can then be queried by the call
asystem_solve(query(Args)). The system returns as answer a table of atoms of the
open predicates whose addition to the program will entail the query.


3.1  Methodology 

The A-system has been applied to a number of applications. Although most of
them are not of “industrial scale”, some general guidelines have emerged from
these experiments, which can be followed to build an application.

The main advantage of the A-system is the modular development of the
problem representation that it allows. This stems from the fact that this
representation can be a direct mapping of the declarative high-level specification of
the problem. The A-system thus allows us to build up the problem representation
incrementally. Typically, this starts with a choice of the alphabet. The alphabet
will determine the way the specification will look and will potentially influence
the reasoning efficiency of the system. For example, in the block world planning,
in the simplest case, it is sufficient to capture the movement of a block from one
place to another in one action move(X,Y,T). This specification cannot represent
a state in which the robot is holding a block. If that becomes important the
move action can be split into pick(X,T) and put(X,Y,T). This refinement is
done locally, affecting only part of the specification. In general, such refinements
allow the user to solve first a more abstract problem whose solutions can then
be further reduced to give a concrete solution of the problem, e.g. as in the
process of hierarchical planning.

Another way of refining the declarative problem representation in ALP is to
add extra integrity constraints. Due to this extra knowledge the A-system will
(in most cases) be able to prune unwanted branches earlier or eliminate
uninteresting solutions from the more general specification. The compactness of the
ALP representations also means that the framework is well suited for problems
in which the specification is subject to (regular) changes. The adaptation to
changes can be done easily without disturbing the whole specification.


3.2  Users  and  Usability

A user familiar with logic programming or first order logic will be able to use the
A-system with little extra help. A good knowledge about the problem domain
will result in better and more informed specifications and thus in better
performance. The A-system also allows the user to influence its behaviour using some
parameters that control how it interacts with the underlying constraint solver. For
this reason it is useful to be familiar with some details of CLP.

The modeling language poses very few restrictions on the problem domains
to which it can be applied. However, the currently implemented prototype has
some. At this moment only problems which can be modeled inside the finite
domains framework can be solved efficiently. This is because only the finite domain
CLP solver is integrated in the system. In principle there are no limitations to
integrating CLP solvers over other domains. Furthermore, we are extending the
system with aggregates (e.g. cardinality, sum, and average) so that a wider range
of problem domains can be solved. This extension will allow the A-system to
solve, for example, problems of planning with resources.


4  Evaluating  the  System 

The A-system has been tested on constraint satisfaction problems (CSP) (e.g.
N-Queens, graph coloring and scheduling problems), examples of diagnosis, and
standard planning problems taken from the AI Planning Systems Competition
2000. It has been tested most extensively on planning, because in this domain the
problems cannot be reduced in a deterministic way to a constraint store; the
system must search for a solution. In simple CSPs like N-Queens,
graph coloring and scheduling, the whole specification can be reduced
to a CLP constraint store. The real problem solving is then left to the CLP solver.
In the planning domain, however, the search for a solution is an interleaved
process of making a choice by the high-level procedure of the A-system and
utilizing the information about the impact of this choice on the constraint store.

As mentioned above, currently only a finite domain solver is incorporated in
the system, and thus the problem size that can be handled is limited by the
efficiency of the constraint solver. As the problem size increases, the system
may also slow down on large problem instances because of the non-optimal data
structures that it uses. It has been successfully tested in the blocks world domain
(using the move operator) with up to 100 blocks. On other planning domains like
logistics the scaling was not as good. In general, the problem size which can be
handled depends on the generated constraint store. If this store becomes too
complex for the constraint solver, the derivation process might get stuck in a
local satisfiability check of this store.

4.1 Benchmarks and Comparison

Currently very few benchmarks are available for testing the full capabilities of the
A-system, with the AIPS planning competition test set being the best example.
The A-system has been evaluated in [4] with some of the problems from this set.

A comparison of the A-system with other systems can be separated into two
classes: comparison with other general purpose systems like Smodels and DLV, and
with more specialized systems like AI planners. At this moment no extensive study
has been made to compare the A-system with the second class of systems. In the
first class, the A-system is capable of solving the same types of applications.
However, Smodels and DLV, which work on propositional theories, are more robust on
some problems, making extensive use of heuristics in their bottom-up
computation. Currently, we are experimenting with different general heuristics that
would allow the A-system to avoid infinitely growing search branches and thus
make it more robust. A number of recent comparison experiments [5] indicate
that the top-down reduction process of the A-system performs, on some classes
of problems, better than Smodels. A characteristic of such a class is that the
high-level specification can be deterministically reduced into a constraint store.

Abstract. In recent years, several approaches for dealing with updates
of logic programs have been proposed. In this paper, we describe the
system upd, an implementation of the update formalism due to Eiter et al.
This method is based on a compilation technique to standard answer set
semantics, in which update sequences are translated into single logic
programs, and which allows the use of existing logic programming systems
as the underlying reasoning engine. In the present case, upd is conceived as
a front-end to the state-of-the-art solver DLV. Besides the basic update
semantics of Eiter et al., the implementation also handles refinements of
the semantics involving certain minimality-of-change criteria.


1  Background 

The problem of updating nonmonotonic knowledge bases has gained increasing
interest in recent years. In particular, several update approaches have been
proposed in which knowledge bases are represented as logic programs [1,2,3,6,7]. In
this paper, we present the system upd, which is an implementation of the method
for updating logic programs due to Eiter et al. [2,3]. This approach is based on
the answer set semantics for extended logic programs, and, like related update
formalisms, it incorporates new information into the current knowledge base
according to a causal rejection principle. This principle enforces that, in case of
conflicts between rules, more recent rules have precedence over older rules. The
general approach can be described as follows.

Given a sequence $(P_1, \ldots, P_n)$ of extended logic programs, each $P_i$ is assumed
to update the information expressed by the initial sequence $(P_1, \ldots, P_{i-1})$. The
sequence $(P_1, \ldots, P_n)$ is then translated into a single logic program $P'$, respecting
the successive update information, such that the answer sets of $P'$ represent the
“update answer sets” of $(P_1, \ldots, P_n)$. The translation is realized by introducing
new atoms which control the applicability of rules with respect to the given
update information. Informally, if two rules, $r \in P_i$ and $r' \in P_j$, assert conflicting
information, where $i < j$, then the more recent rule, $r'$, is applied, whilst $r$ is
“rejected”. From a technical point of view, this rejection principle is expressed
in terms of the so-called rejection set, $rej(S, P)$, which consists of all rules of the
given update sequence $P = (P_1, \ldots, P_n)$ which are rejected on the basis of an
update answer set $S$.

A property which this basic update semantics intuitively does not respect is
minimality of change. In general, however, it is desirable to incorporate a new
set of rules into an existing program with as little change as possible. This is
realized by the notions of minimal and strictly minimal update answer sets, as
introduced in [2,3]. Intuitively, an update answer set $S$ is minimal iff there is no
update answer set $S'$ of the update sequence $P = (P_1, \ldots, P_n)$ yielding a smaller
rejection set, i.e., such that $rej(S', P) \subset rej(S, P)$ holds. Strict minimality is a
somewhat stronger notion which also takes the rules rejected at specific levels into
account. More specifically, let $rej_i(S, P)$ be the rejected rules contained in $P_i$
($1 \le i \le n$); then the update answer set $S$ is strictly minimal iff there is no
update answer set $S'$ of the update sequence $P$ such that, for some level $i$,
$rej_i(S', P) \subset rej_i(S, P)$ and $rej_j(S', P) = rej_j(S, P)$ for $i + 1 \le j \le n$.
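The difference between the two notions can be illustrated with a small Python sketch. It assumes that the per-level rejection sets of each update answer set are already known and are given as a list of sets (index 0 standing for $P_1$). The function names and the toy data are hypothetical and are not part of upd.

```python
def tagged(levels):
    """Union of the per-level rejection sets, each rule tagged with its level."""
    return {(i, rule) for i, level in enumerate(levels) for rule in level}


def is_minimal(name, rej):
    """True iff no other update answer set has a strictly smaller rejection set."""
    mine = tagged(rej[name])
    return not any(tagged(rej[other]) < mine for other in rej if other != name)


def is_strictly_minimal(name, rej):
    """True iff no other update answer set rejects strictly less at some level i
    while rejecting exactly the same rules at all levels above i."""
    mine = rej[name]
    n = len(mine)
    for other, theirs in rej.items():
        if other == name:
            continue
        for i in range(n):
            smaller_here = theirs[i] < mine[i]
            same_above = all(theirs[j] == mine[j] for j in range(i + 1, n))
            if smaller_here and same_above:
                return False
    return True


# Toy data (hypothetical): two update answer sets of a two-program sequence.
# S rejects rule "r2" of P_2, T rejects rule "r1" of P_1.
rej = {"S": [set(), {"r2"}], "T": [{"r1"}, set()]}

for name in rej:
    print(name, is_minimal(name, rej), is_strictly_minimal(name, rej))
```

In this toy instance the overall rejection sets are incomparable, so both sets are minimal, while only T passes the level-wise test.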

Generally speaking, the implementation upd handles the following reasoning
tasks: (i) checking the existence of an update answer set for a given update
sequence, (ii) brave reasoning, and (iii) skeptical reasoning. Each of these tasks
is realized for the basic update semantics, as well as for minimal and strictly
minimal update answer sets. Furthermore, the tasks are defined for
function-free (datalog) programs, utilizing the advanced grounding mechanism of DLV.

2  System  Specifics 

2.1  General  Information 

Since the above update approach is based on a compilation technique to standard
answer set semantics, it is possible to build an implementation using an existing
logic programming system as the underlying reasoning engine. In the present case,
upd is realized as a front-end to the logic programming tool DLV [4,5], which
is a state-of-the-art solver for disjunctive logic programs under the answer set
semantics. Of course, smodels [8], a state-of-the-art system for normal logic
programs, could also be employed as the underlying reasoning engine.

Given a sequence of update programs as input, upd first translates this
sequence into a single extended logic program, $P_\triangleleft$, and then invokes DLV to
calculate the answer sets of $P_\triangleleft$. In order to obtain update answer sets of the given
input sequence, the special-purpose atoms introduced by the translation are
filtered from the answer sets of $P_\triangleleft$.

For dealing with minimal and strictly minimal update answer sets, upd
employs a two-phase evaluation approach. The overall algorithm for calculating
minimal update answer sets is depicted in Figure 1. Roughly speaking,
the algorithm proceeds as follows: First, the answer sets of the update program
$P_\triangleleft$ are calculated. As soon as an answer set $S$ is produced (denoted by
Next_Answer_Set($P_\triangleleft$)), it is tested for being minimal by calculating the answer
sets of a particular test program, $P_S^{\min}$, consisting of the rules of $P_\triangleleft$ together
with a set of additional rules. $S$ is minimal iff the test program $P_S^{\min}$ has no
answer set. The algorithm for strictly minimal update answer sets is analogous;
the only difference is that the test program $P_S^{\min}$ is replaced by a suitable test
program for strict minimality.

Algorithm Compute_Minimal_Models(P)

Input:  A sequence of ELPs P = (P_1, ..., P_n).
Output: All minimal answer sets of P.

var S : AnswerSet;
var MinModels : Set_Of_AnswerSets;

S := Next_Answer_Set(P_◁);
while S ≠ nil do
    var Counter : Set_Of_AnswerSets;
    Counter := Compute_Answer_Sets(P_S^min);
    if (Counter = ∅) then
        MinModels := MinModels ∪ {S};
    fi
    S := Next_Answer_Set(P_◁);
od

return MinModels;


Fig. 1. Algorithm to calculate minimal update answer sets
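The control flow of Figure 1 can be sketched in Python as follows. This is only an illustration of the two-phase loop: the answer sets are supplied by an iterator standing in for the repeated Next_Answer_Set calls, and the test program is replaced by a hypothetical callable that returns its answer sets. No DLV call is made, and all names and data are assumptions introduced for the example.

```python
def compute_minimal_models(answer_sets, test_program_answer_sets):
    """Keep every answer set whose test program has no answer set."""
    min_models = []
    for s in answer_sets:                      # one Next_Answer_Set step per iteration
        counter = test_program_answer_sets(s)  # phase two: evaluate the test program
        if not counter:                        # Counter is empty, so s is minimal
            min_models.append(s)
    return min_models


# Toy stand-in (hypothetical): rejection sets of two update answer sets.
rejection = {"S1": {"r1", "r2"}, "S2": {"r2"}}


def toy_test_program(s):
    # Plays the role of the test program: its "answer sets" are the
    # candidates with a strictly smaller rejection set than s.
    return [t for t in rejection if rejection[t] < rejection[s]]


print(compute_minimal_models(iter(rejection), toy_test_program))
```

Because the candidates are consumed one at a time, each answer set can be tested as soon as it is produced, as described in the text.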


2.2  Applying  the  System 

The general syntax of upd coincides with the syntax of DLV. Update sequences
are represented by grouping rules using the braces “{” and “}”, as illustrated
by the following example:

{sleep :- night, not tv_on.
watch_tv :- tv_on.
night.
tv_on.}

{-tv_on :- power_failure.
power_failure.}

This input represents an update sequence $(P_1, P_2)$, where the first group of rules
constitutes the initial knowledge base, $P_1$, and the second group corresponds to
the update information $P_2$.

Intuitively, the above example expresses the following situation: The initial
program specifies that someone sleeps at night unless the TV is on, in which
case the person is watching TV. This knowledge is updated by the information
that the TV is not on provided there is a power failure, and that there is actually a
power failure.

The program upd processes inputs either in the form of files or as immediate
input via a command shell. Supposing the above sequence of programs has been
saved in a file named tv.lps, the computation of the corresponding update
answer sets can be started with the command “upd”, producing the following
output:

> upd -p=~/bin/dlv tv.lps

upd [build BEN/Nov 15 2000 gcc 2.95.2 19991024 (release)]

dlv [build BEN/Jun 11 2001 gcc 2.95.2 19991024 (release)]

{night, power_failure, sleep, -tv_on}

Observe that upd requires an explicit specification of which particular prover
should be invoked during the computation process. This choice is determined by
the option -p, which allows for selecting alternative evaluation tools besides DLV.

It is also possible to feed upd with multiple inputs. For instance, suppose
we have another file, say tv.cont.lps, containing the following sequence of
programs:

{-power_failure.}

{switched_off :- not tv_on, not power_failure.
tv_on :- not switched_off, not power_failure.
-tv_on :- switched_off.}

If these programs are assumed to update the information given by file tv.lps,
the update answer sets of the overall sequence (consisting of four programs) can
be computed as follows:

> upd -silent -p=~/bin/dlv -o=-silent tv.lps tv.cont.lps
{-tv_on, night, switched_off, -power_failure, sleep}

{tv_on, watch_tv, night, -power_failure}

Here, options -silent and -o=-silent have been invoked to suppress any
additional upd and DLV messages, where -o allows passing options to the
employed evaluation program (DLV in the present case).

Further  options  of  upd  are  -min  and  -strict,  which  specify  whether  minimal 
or  strictly  minimal  update  answer  sets  should  be  computed,  respectively.  For 
instance,  if  we  are  interested  in  computing  the  minimal  update  answer  sets  of 
the  sequence  given  by  the  files  tv.lps  and  tv.cont.lps,  we  may  call  upd  as 
follows: 

>  upd  -min  -silent  -p=~/bin/dlv  -o=-silent  tv.lps  tv.cont.lps 
{tv_on, watch_tv, night, -power_failure}
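The three results quoted above can be cross-checked with a brute-force Python sketch. It does not use the compilation technique of upd and does not call DLV. Instead it assumes the usual formulation of causal rejection: a rule of $P_i$ is rejected with respect to $S$ if its body holds in $S$ and some non-rejected rule of a later program has the complementary head and a body that also holds in $S$; $S$ is an update answer set if it is an answer set of the remaining rules. The rule encoding, all function names, and the restriction to consistent candidates are assumptions made for this illustration.

```python
from itertools import combinations


def neg(lit):
    """Strong negation: 'a' <-> '-a'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def body_holds(rule, s):
    _, pos, naf = rule
    return all(l in s for l in pos) and all(l not in s for l in naf)


def rejection_set(s, seq):
    """Rejected rules (tagged with their level), from the last program backwards."""
    rejected = set()
    for i in range(len(seq) - 1, -1, -1):
        for r in seq[i]:
            if not body_holds(r, s):
                continue
            for j in range(i + 1, len(seq)):
                if any(r2[0] == neg(r[0]) and body_holds(r2, s)
                       and (j, r2) not in rejected for r2 in seq[j]):
                    rejected.add((i, r))
    return rejected


def least_model(rules, s):
    """Least model of the Gelfond-Lifschitz reduct of rules with respect to s."""
    reduct = [(h, pos) for h, pos, naf in rules if all(l not in s for l in naf)]
    model, changed = set(), True
    while changed:
        changed = False
        for h, pos in reduct:
            if h not in model and all(l in model for l in pos):
                model.add(h)
                changed = True
    return model


def update_answer_sets(seq):
    """All consistent update answer sets, each paired with its rejection set."""
    heads = sorted({r[0] for prog in seq for r in prog})
    found = []
    for k in range(len(heads) + 1):
        for cand in combinations(heads, k):
            s = set(cand)
            if any(neg(l) in s for l in s):
                continue  # skip inconsistent candidates
            rej = rejection_set(s, seq)
            kept = [r for i, prog in enumerate(seq) for r in prog if (i, r) not in rej]
            if least_model(kept, s) == s:
                found.append((frozenset(s), frozenset(rej)))
    return found


def minimal(found):
    return [s for s, rej in found if not any(other < rej for _, other in found)]


# Rules are (head, positive body, default-negated body).
P1 = [("sleep", ("night",), ("tv_on",)), ("watch_tv", ("tv_on",), ()),
      ("night", (), ()), ("tv_on", (), ())]
P2 = [("-tv_on", ("power_failure",), ()), ("power_failure", (), ())]
P3 = [("-power_failure", (), ())]
P4 = [("switched_off", (), ("tv_on", "power_failure")),
      ("tv_on", (), ("switched_off", "power_failure")),
      ("-tv_on", ("switched_off",), ())]

print([sorted(s) for s, _ in update_answer_sets([P1, P2])])
print([sorted(s) for s, _ in update_answer_sets([P1, P2, P3, P4])])
print([sorted(s) for s in minimal(update_answer_sets([P1, P2, P3, P4]))])
```

The enumeration is exponential in the number of head literals, so it is only meant for examples of this size.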

Finally, upd can be downloaded from the Web at

http://www.kr.tuwien.ac.at/staff/giuliana/project.html.

Abstract. We present an implementation of an answer-set programming
paradigm, called aspps (short for answer-set programming with
propositional schemata). The system aspps is designed to process
PS+-theories. It consists of two basic modules. The first module, psgrnd,
grounds a PS+-theory. The second module, referred to as aspps, is a
solver. It computes models of ground PS+-theories.


1  Introduction 

The most advanced answer-set programming systems are, at present,
smodels [NS00] and dlv [ELM+98]. They are based on the formalisms of logic
programming with stable-model semantics and disjunctive logic programming with
answer-set semantics, respectively. We present an implementation of an
answer-set programming system, aspps (short for answer-set programming with
propositional schemata). It is based on the extended logic of propositional schemata
with closed world assumption that we denote by PS+. We introduced this logic
in [ET01].

A theory in the logic PS+ is a pair (D, P), where D is a collection of ground
atoms representing a problem instance (input data), and P is a program, that is,
a collection of PS+-clauses (encoding of a problem to solve). The meaning of
a PS+-theory T = (D, P) is given by a family of PS+-models [ET01]. Each
model in this family represents a solution to a problem encoded by P for data
instance D.

The system aspps is designed to process PS+-theories. It consists of two
basic programs. The first of them, psgrnd, grounds a PS+-theory. That is, it
produces a ground (propositional) theory extended by a number of special
constructs. These constructs help model cardinality constraints on sets. The second
program, referred to as aspps, is a solver. It computes models of grounded
PS+-theories. It is designed along the lines of a standard Davis-Putnam algorithm for
satisfiability checking. Both psgrnd and aspps, examples of PS+-programs and
the corresponding performance results are available at
http://www.cs.uky.edu/ai/aspps/.

2 PS+-Theories

A PS+-theory is a pair (D, P), where D is a collection of ground atoms and P
is a collection of PS+-clauses. Atoms in D represent input data (an instance of
a problem). In our implementation these atoms may be stored in one or more
data files. The set of PS+-clauses models the constraints (specification) of the
problem. In our implementation, all the PS+-clauses in P are stored in a single
rule file.

All  statements  in  data  and  rule  files  must  end  with  a  period  (.).  Clauses 
may  be  split  across  several  lines.  Blank  lines  can  be  used  in  data  and  rule  files 
to  improve  readability.  Comments  may  be  used  too.  They  begin  with  ‘%’  and 
continue  to  the  end  of  the  line. 

Data files. Each ground atom in a data file must be given on a single line.
Constant symbols may be used as arguments of ground atoms. In such cases,
these constant symbols must be specified at the command line (see Section 3).
Examples of ground atoms are given below:

vtx(2).
vtx(3).
size(k).

A set of ground atoms of the form {p(m), p(m+1), ..., p(n)}, where m and n
are non-negative integers or integer constants specified at the command line, can
be represented in a data file as ‘p[m..n].’. Thus, the two ground atoms vtx(2)
and vtx(3) can be specified as ‘vtx[2..3].’.
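The range shorthand can be made concrete with a short Python sketch that expands a statement of the form ‘p[m..n].’ into the ground atoms it abbreviates. This is an illustration of the notation only, not the parser used by the grounder; the function name, the regular expression and the `constants` dictionary (standing in for constants given at the command line) are assumptions.

```python
import re


def expand_range(statement, constants=None):
    """Expand 'p[m..n].' into ['p(m).', ..., 'p(n).'].

    Bounds may be integers or names looked up in `constants`.
    """
    constants = constants or {}
    match = re.fullmatch(r"\s*(\w+)\[(\w+)\.\.(\w+)\]\.\s*", statement)
    if match is None:
        raise ValueError(f"not a range statement: {statement!r}")
    name, low, high = match.groups()

    def value(token):
        # A bound is either a named constant or an integer literal.
        return int(constants.get(token, token))

    return [f"{name}({i})." for i in range(value(low), value(high) + 1)]


print(expand_range("vtx[2..3]."))
print(expand_range("vtx[1..n].", {"n": 3}))
```

A bound that is neither an integer nor a known constant raises a `ValueError` from `int`, which mirrors the requirement that range bounds be integers.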

Predicates  used  by  ground  atoms  in  data  files  are  called  data  predicates. 

Rule files. The rule file of a PS+-theory consists of two parts. In the first one,
the preamble, we declare all program predicates, that is, predicates that are not
used in data files. We also declare types of all variables that will be used in
the rule file. Typing of variables simplifies the implementation of the grounding
program psgrnd and facilitates error checking.

Arguments of each program predicate are typed by unary data predicates
(the idea is that when grounding, each argument can only be replaced by an
element of the extension of the corresponding unary data predicate as specified
by the data files). A program predicate q with n arguments of types dp_1, ..., dp_n,
where all dp_i are data predicates, is declared in one of the following two ways:

pred q(dp_1, ..., dp_n).
pred q(dp_1, ..., dp_n) : dp_m.

In the second statement, the n-ary data predicate dp_m further restricts the
extension of q: it must be a subset of the extension of dp_m (as specified by the
data files).

Variable  declarations  begin  with  the  keyword  var.  It  is  followed  by  the  unary 
data  predicate  name  and  a  list  of  alpha-numeric  strings  serving  as  variable  names 
(they  must  start  with  a  letter).  Thus,  to  declare  two  variables  X  and  Y  of  type 
dp,  where  dp  is  a  unary  data  predicate  we  write: 


var  dp  X,Y. 

The implementation allows for predefined predicates and function symbols
such as the equality operator ==, arithmetic comparators <=, >=, < and
>, and arithmetic operations +, -, *, /, abs() (absolute value), mod(N,b),
max(X,Y) and min(X,Y). We assign to these symbols their standard
interpretation. However, we emphasize that the domains are restricted only to those
constants that appear in a theory.

The  second  part  of  the  rule  file  contains  the  program  itself,  that  is,  a  collection 
of  clauses  describing  constraints  of  the  problem  to  be  solved. 

By a term tuple we mean a tuple each component of which is a variable, a
constant symbol, or an arithmetic expression. An atom is an expression of one
of the following four forms.

1. p(t), where p is a predicate (possibly a predefined predicate) and t is a tuple
of variables, constants and arithmetic expressions.

2. p(t,Y) : dp(Y), where p is a program predicate, t is a term tuple, and dp is
a unary data predicate.

3. m{p(t) : d_1(t_1) : ... : d_k(t_k)}n, where p is a program predicate, each d_i is a
data or a predefined predicate, and t and all t_i are term tuples.

4. m{p_1(t), ..., p_k(t)}n, where all p_i are program predicates and t is a term
tuple.

Atoms of the second type are called e-atoms and atoms of types 3 and 4 are
called c-atoms. Intuitively, an e-atom ‘p(t,Y) : dp(Y)’ stands for ‘there exists Y
in the extension of the data predicate dp such that p(t,Y) is true’. An intuitive
meaning of a c-atom ‘m{p(t) : d_1(t_1) : ... : d_k(t_k)}n’ is: from the set of all atoms
p(t) such that for every i, 1 <= i <= k, d_i(t_i) is true (t_i is the projection of t
onto the attributes of d_i), at least m and no more than n are true. The meaning of
a c-atom ‘m{p_1(t), ..., p_k(t)}n’ is similar: at least m and no more than n atoms
in the set {p_1(t), ..., p_k(t)} are true.
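The intuitive readings of e-atoms and c-atoms can be sketched as two small Python checks against a candidate model. The representation is an assumption made for the example: a model is a set of ground atoms, each encoded as a pair of predicate name and argument tuple, and the set of atoms ranged over by a c-atom is passed in already grounded. Neither function reflects how aspps evaluates these constructs internally.

```python
def c_atom_holds(m, ground_atoms, n, model):
    """m{...}n: at least m and at most n of the listed ground atoms are true."""
    true_count = sum(1 for atom in ground_atoms if atom in model)
    return m <= true_count <= n


def e_atom_holds(pred, t, dp_extension, model):
    """p(t,Y) : dp(Y): some Y in the extension of dp makes p(t,Y) true."""
    return any((pred, t + (y,)) in model for y in dp_extension)


# Hypothetical model: vertices 1 and 3 are selected, and there is an edge (1, 2).
model = {("in", (1,)), ("in", (3,)), ("edge", (1, 2))}
vtx = [1, 2, 3]  # extension of the unary data predicate vtx

selected = [("in", (v,)) for v in vtx]
print(c_atom_holds(1, selected, 2, model))     # 1{in(X) : vtx(X)}2
print(c_atom_holds(3, selected, 3, model))     # 3{in(X) : vtx(X)}3
print(e_atom_holds("edge", (1,), vtx, model))  # edge(1,Y) : vtx(Y)
print(e_atom_holds("edge", (2,), vtx, model))  # edge(2,Y) : vtx(Y)
```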

We are now ready to define clauses. They are expressions of the form

A_1, ..., A_m -> B_1 | ... | B_n.

where the A_i's and B_j's are atoms, ‘,’ stands for the conjunction operator and ‘|’
stands for the disjunction operator.

3 Processing PS+-Theories

To compute models of a PS+-theory (D, P) we first ground it. To this end,
we use the program psgrnd. Next, we compute models of the ground theory
produced by psgrnd. To accomplish this task, we use the program aspps. For the
detailed description of the grounding process and, especially, for the treatment
of e-atoms and c-atoms, and for a discussion of the design of the aspps program,
we refer the reader to [ET01].

The required input to execute psgrnd is a single program file, one or more
data files and optional constants. If no errors are found while reading the files
and during grounding, an output file is constructed. The output file is a
machine-readable file whose name is a concatenation of the constants and file names with
the extension .tdc.

psgrnd -r rfile -d dfile1 dfile2 ... [-c c1=v1 c2=v2 ...]

Required arguments

-r rfile is the file describing the problem (rule file). There must be exactly one
rule file.

-d datafilelist is one or more files containing data that will be used to
instantiate the theory.

Optional arguments

-c name=value This option allows the use of constants in both the data and
rule files. When name is found while reading input files it is replaced by
value; value can be any string that is valid for the data type. If name is
to be used in a range specification, then value must be an integer.

The program aspps is used to solve the grounded theory constructed by
psgrnd. The name of the file containing the theory is input on the command
line. After executing the aspps program, a file named aspps.stat is created or
appended with statistics concerning this run of aspps.

aspps -f filename [-A] [-P] [-C [x]] [-S name]

Required arguments

-f filename is the name of the file containing a theory produced by psgrnd.

Optional arguments

-A Prints the positive atoms for solved theories in readable form.

-P Prints the input theory and then exits.

-C [x] Counts the number of solutions. This information is recorded in the
statistics file. If x is specified it must be a positive integer; aspps stops after
finding x solutions or exhausting the whole search space, whichever comes
first.

-S name Shows positive atoms with predicate name.

1 Introduction

NoMoRe implements answer set semantics for normal logic programs. It realizes a
novel paradigm to compute answer sets by computing a-colorings (non-standard
graph colorings with two colors) of the block graph (a labeled digraph) associated
with a given program $P$ (see [5] for details). Intuitively, an a-coloring reflects the
set of generating rules for an answer set, which means that noMoRe is rule-based
and not atom-based like most of the other known systems. Since the core system
was designed for propositional programs only, we have integrated lparse [8] as
a grounder in order to deal with variables. Furthermore, we have included an
interface to the graph drawing tool DaVinci [6] for visualization of block graphs.
This allows for a structural analysis of programs.

The noMoRe system is implemented in the programming language Prolog;
it has been developed under the ECLiPSe Constraint Logic Programming
System [1] and it was also successfully tested with SWI-Prolog [9]. The system is
available at http://www.cs.uni-potsdam.de/~linke/nomore. In order to use the
system, ECLiPSe or SWI-Prolog is needed [1,9]¹.

2  Theoretical  Background 

The current prototype of the noMoRe system implements nonmonotonic reasoning
with normal logic programs under answer set semantics [4]. We consider rules $r$
of the form $p \leftarrow q_1, \ldots, q_n, not\ s_1, \ldots, not\ s_k$ where $p$, $q_i$ ($0 \le i \le n$) and $s_j$ ($0 \le j \le k$)
are atoms, $head(r) = p$, $body^+(r) = \{q_1, \ldots, q_n\}$, $body^-(r) = \{s_1, \ldots, s_k\}$ and
$body(r) = body^+(r) \cup body^-(r)$.

Look at the following normal logic program:

$P = \{\, a \leftarrow b, not\ e. \quad b \leftarrow d. \quad c \leftarrow b. \quad d \leftarrow. \quad e \leftarrow d, not\ f. \quad f \leftarrow a. \,\}$  (1)

Let us call the rules of program (1) $r_a$, $r_b$, $r_c$, $r_d$, $r_e$, and $r_f$, respectively. $P$ has
the answer sets $A_1 = \{d, b, c, a, f\}$ and $A_2 = \{d, b, c, e\}$. It is easy to see that the
application of $r_f$ blocks the application of $r_e$ wrt $A_1$, because if $r_f$ contributes
to $A_1$, then $f \in A_1$ and thus $r_e$ cannot be applied. Analogously, $r_e$ blocks $r_a$ wrt
answer set $A_2$. This observation leads us to a strictly blockage-based approach.
The block graph of program $P$ is a directed graph on the rules of $P$:

¹ Both Prolog systems are freely available for scientific use.

T.  Eiter,  W.  Faber,  and  M.  Truszczynski  (Eds.):  LPNMR  2001,  LNAI  2173,  pp.  406-410,  2001. 

Definition 1. ([5]) Let $P$ be a logic program and let $P' \subseteq P$ be maximal
grounded². The block graph $\Gamma_P = (V_P, A_P^0 \cup A_P^1)$ of $P$ is a directed graph with
vertices $V_P = P$ and two different kinds of arcs

$A_P^0 = \{(r', r) \mid r', r \in P' \text{ and } head(r') \in body^+(r)\}$

$A_P^1 = \{(r', r) \mid r', r \in P' \text{ and } head(r') \in body^-(r)\}$.

Figure 1 shows the block graph of program (1). Since groundedness (by
definition) ignores negative bodies, there exists a unique maximal grounded set
$P' \subseteq P$ for each program $P$, that is, $\Gamma_P$ is well-defined. Definition 1 captures the
conditions under which a rule $r'$ blocks another rule $r$ (i.e. $(r', r) \in A_P^1$). We also
gather all groundedness information in $\Gamma_P$, due to the restriction to rules in the
maximal grounded part of $P$. This is essential because a block relation between
two rules $r'$ and $r$ becomes effective only if $r'$ is groundable through other rules.
Therefore $\Gamma_P$ captures all information necessary for computing the answer sets
of program $P$.
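Definition 1 can be traced on program (1) with a short Python sketch. It computes the maximal grounded set by a simple fixpoint iteration and then collects the two kinds of arcs. The encoding of rules as (head, positive body, negative body) triples and the function names are assumptions made for the illustration; noMoRe itself is written in Prolog and is not reproduced here.

```python
def maximal_grounded(rules):
    """Names of the rules in the maximal grounded subset (negative bodies ignored)."""
    grounded, heads = [], set()
    changed = True
    while changed:
        changed = False
        for name, (head, pos, _) in rules.items():
            if name not in grounded and set(pos) <= heads:
                grounded.append(name)
                heads.add(head)
                changed = True
    return grounded


def block_graph(rules):
    """Vertices, 0-arcs and 1-arcs of the block graph, following Definition 1."""
    g = maximal_grounded(rules)
    arcs0 = {(r1, r) for r1 in g for r in g if rules[r1][0] in rules[r][1]}
    arcs1 = {(r1, r) for r1 in g for r in g if rules[r1][0] in rules[r][2]}
    return set(rules), arcs0, arcs1


# Program (1): name -> (head, positive body, negative body)
program = {
    "ra": ("a", ("b",), ("e",)),
    "rb": ("b", ("d",), ()),
    "rc": ("c", ("b",), ()),
    "rd": ("d", (), ()),
    "re": ("e", ("d",), ("f",)),
    "rf": ("f", ("a",), ()),
}

vertices, arcs0, arcs1 = block_graph(program)
print(sorted(arcs0))
print(sorted(arcs1))
```

For program (1) every rule is grounded, and the 1-arcs are exactly the two blocking relations discussed above: $r_f$ blocks $r_e$, and $r_e$ blocks $r_a$.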

Answer sets then are characterized as special non-standard graph colorings
of block graphs. We denote 0-predecessors, 0-successors, 1-predecessors and
1-successors of $v$ in $\Gamma_P$ by $\gamma_0^-(v)$, $\gamma_0^+(v)$, $\gamma_1^-(v)$ and $\gamma_1^+(v)$ for $v \in V$, respectively.

Definition 2. ([5]) Let $P$ be a logic program s.t. $|body^+(r)| \le 1$ for each $r \in P$,
let $\Gamma_P = (P, A_P^0 \cup A_P^1)$ be the corresponding block graph and let $c : P \to \{\ominus, \oplus\}$ be
a mapping. Then $c$ is an a-coloring (application-coloring) of $\Gamma_P$ iff the following
conditions hold for each $r \in P$:

A1 $c(r) = \ominus$ iff one of the following conditions holds
a. $\gamma_0^-(r) \ne \emptyset$ and for each $r' \in \gamma_0^-(r)$ we have $c(r') = \ominus$
b. there is some $r'' \in \gamma_1^-(r)$ s.t. $c(r'') = \oplus$.

A2 $c(r) = \oplus$ iff both of the following conditions hold

a. $\gamma_0^-(r) = \emptyset$ or there exists a grounded 0-path³ $G_r$ s.t. $c(G_r) = \oplus$

b. for each $r'' \in \gamma_1^-(r)$ we have $c(r'') = \ominus$.

Observe that there are programs (e.g. $P = \{p \leftarrow not\ p\}$) s.t. no a-coloring exists
for $\Gamma_P$. Intuitively, each node of the block graph (corresponding to some rule) is
colored with one of two colors, representing application ($\oplus$) or non-application
($\ominus$) of the corresponding rule. The coloring presented in Figure 1 corresponds
to answer set $A_1$ of $P$. Node (rule) $r_e$ has to be colored $\ominus$ (not applied), because
there is some 1-predecessor of $r_e$ colored $\oplus$ (applied). In other words, $r_f$ blocks $r_e$.
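The correspondence between answer sets and colorings can be illustrated in Python by reading a coloring off a known answer set: a rule is marked as applied when its positive body is contained in the answer set and its negative body is disjoint from it. This sketch goes from answer sets to colorings only; it is not the coloring algorithm of noMoRe, and the rule encoding and the `+`/`-` labels (standing for $\oplus$ and $\ominus$) are assumptions made for the example.

```python
def coloring_from_answer_set(rules, answer_set):
    """Mark each rule '+' (applied) or '-' (not applied) with respect to answer_set."""
    coloring = {}
    for name, (head, pos, naf) in rules.items():
        applied = set(pos) <= answer_set and not (set(naf) & answer_set)
        coloring[name] = "+" if applied else "-"
    return coloring


def heads_of_applied(rules, coloring):
    """Heads of all rules colored '+', which should give back the answer set."""
    return {rules[name][0] for name, color in coloring.items() if color == "+"}


# Program (1): name -> (head, positive body, negative body)
program = {
    "ra": ("a", ("b",), ("e",)),
    "rb": ("b", ("d",), ()),
    "rc": ("c", ("b",), ()),
    "rd": ("d", (), ()),
    "re": ("e", ("d",), ("f",)),
    "rf": ("f", ("a",), ()),
}

A1 = {"d", "b", "c", "a", "f"}
A2 = {"d", "b", "c", "e"}

for answer_set in (A1, A2):
    coloring = coloring_from_answer_set(program, answer_set)
    print(coloring, heads_of_applied(program, coloring) == answer_set)
```

For $A_1$ only $r_e$ is left unapplied, which matches the remark that $r_f$ blocks $r_e$; for $A_2$ the unapplied rules are $r_a$ and $r_f$.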


² A set of rules $S$ is grounded iff there exists an enumeration $\langle r_i \rangle_{i \in I}$ of $S$ such that for
all $i \in I$ we have that $body^+(r_i) \subseteq head(\{r_1, \ldots, r_{i-1}\})$. A maximal grounded set $P'$
is a grounded set that is maximal wrt set inclusion. We generalize the definition of
the head of a rule to sets of rules in the usual way.

³ A subset of rules $G_r \subseteq P$ is a grounded 0-path for $r \in P$ if $G_r$ is a 0-path from some
fact to $r$ in $\Gamma_P$. For a set of rules $S \subseteq P$ we write $c(S) = \oplus$ or $c(S) = \ominus$ if for each
$r \in S$ we have $c(r) = \oplus$ or $c(r) = \ominus$, respectively. For the generalization of condition
$|body^+(r)| \le 1$ see [5]. There you can also find further details on a-colorings and
the algorithm to compute them.

Fig. 1. Block graph of program (1) with a-coloring corresponding to answer
set $A_1$

3  Description  of  the  System 

noMoRe uses a compilation technique to compute answer sets of a logic program P
in three steps (see Figure 2). First, the block graph $\Gamma_P$ is computed. Second,
$\Gamma_P$ is compiled into Prolog code in order to obtain an efficient coloring
procedure. Third, the compiled Prolog code is used to actually compute the
answer sets. To read logic programs we use a parser (possibly after running
lparse), and there is a separate component that interprets a-colorings as
answer sets. For information purposes there is a further component for
visualizing block graphs using the graph drawing tool DaVinci [6]. The noMoRe
system is used for research on the underlying paradigm. Even in this early
state, it is usable by anybody familiar with the logic programming paradigm.
The syntax accepted by noMoRe is Prolog-like. For example, the first rule of
program (1) is represented as a :- b, not e.


Fig. 2. The architecture of noMoRe

4  Evaluating  the  System 

As a first benchmark, we used two NP-complete problems proposed in [2]: the
problem of finding a Hamiltonian path in a graph (Ham) and the independent
set problem (Ind). In terms of time used for computing answer sets, our first
Prolog implementation is not comparable with state-of-the-art C/C++
implementations, e.g. smodels [7] and dlv [3]. Therefore we compare the number
of choice points used, because it reflects how an algorithm deals with the
exponential part of a problem. Unfortunately, only smodels gives information
about its
Table 1. Number of choice points for Ham-problems of complete graphs with n
nodes

all solutions for Ham of K_n
n =        7        8         9          10
smodels    4800     86364     1864470    45168575
noMoRe     14335    115826    1160533    7864853

one solution for Ham of K_n
n =        ?     ?     ?     ?     ?       11       12
smodels    3     4     8     48    1107    18118    398306
noMoRe     16    20    ?     34    58      69       79

choice points. For this reason, we have concentrated on comparing our approach
with smodels.

Results are given for finding all solutions of different instances of Ham and
Ind. Table 1 shows results for some Ham-encodings of complete graphs $K_n$,
where n is the number of nodes. Surprisingly, it turns out that noMoRe performs
very well on this problem class. That is, with growing problem size we need
fewer choice points (and less time) than smodels. This can also be seen in
Table 2, which shows the corresponding time measurements. To be fair, for
Ind-problems of graphs $Cir_n$ we need more choice points (and much more time)
than smodels. However, even with the same number of choice points smodels is
faster than noMoRe, because noMoRe uses the general backtracking of Prolog,
whereas the backtracking of smodels is highly specialized for computing answer
sets. The same applies to dlv. Even so, it is clear that our approach is a very
promising one.
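The trend described above can be checked with a few lines of arithmetic. The following Python sketch only restates the all-solutions figures of Table 1 and computes the ratio of noMoRe to smodels choice points; the names `choice_points` and `ratio` are introduced here for illustration.

```python
# Choice points for finding all Hamiltonian paths of K_n (all-solutions
# part of Table 1).
choice_points = {
    7: {"smodels": 4800, "noMoRe": 14335},
    8: {"smodels": 86364, "noMoRe": 115826},
    9: {"smodels": 1864470, "noMoRe": 1160533},
    10: {"smodels": 45168575, "noMoRe": 7864853},
}


def ratio(n: int) -> float:
    """noMoRe choice points divided by smodels choice points for K_n."""
    row = choice_points[n]
    return row["noMoRe"] / row["smodels"]


if __name__ == "__main__":
    for n in sorted(choice_points):
        print(f"n = {n:2d}: noMoRe / smodels = {ratio(n):.2f}")
```

The ratio falls from roughly 3 at n = 7 to roughly 0.17 at n = 10, with the crossover between n = 8 and n = 9.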


Table 2. Time measurements in seconds with ECLiPSe Prolog for Ham- and
Ind-problems on a SUN Ultra2 with two 300MHz Sparc processors (compilation
time not included)

Ham of K_n, all solutions
n =        8      9       10
smodels    54     1334    38550
dlv        4      50      493
noMoRe     208    2556    21586

Ham of K_n, one solution
n =        5       6       7       8       9       10      11      12
smodels    0.01    0.02    0.04    0.04    0.11    1.61    24      526
dlv        0.02    0.03    0.03    0.05    0.06    0.07    0.09    0.15
noMoRe     0.01    0.02    0.06    0.12    0.25    0.40    0.53    1.02

Ind of Cir_n, all solutions
n =        40    50     60
smodels    8     219    4052
dlv        13    259    4594
noMoRe     39    706    12767
Abstract. This paper describes a generic compiler, called plp, for
translating ordered logic programs into standard logic programs under the
answer set semantics. In an ordered logic program, preference information
is expressed at the object level by atoms of the form $s \prec t$, where s
and t are names of rules. An ordered logic program is transformed into
a second, regular, extended logic program wherein the preferences are
respected, in that the answer sets obtained in the transformed theory
correspond with the preferred answer sets of the original theory.
Currently, plp treats three different types of preference strategies, viz.
those proposed by (i) Brewka and Eiter, (ii) Delgrande, Schaub, and
Tompits, and (iii) Wang, Zhou, and Lin. Since the result of the translation
is an extended logic program, existing logic programming systems can be
used as the underlying reasoning engine. In particular, plp is conceived as
a front-end to the logic programming systems dlv and smodels.


1  General  Information 

Several  approaches  have  been  introduced  in  recent  years  for  expressing  preference 
information  within  declarative  knowledge  representation  formalisms  [7,11,1,10]. 
However,  most  of  these  methods  treat  preferences  at  the  meta-level  and  require  a 
change  of  the  underlying  semantics.  As  a  result,  implementations  need  in  general 
fresh  algorithms  and  cannot  rely  on  existing  systems  computing  the  regular 
(unordered)  formalisms. 

In this paper, we describe the system plp, which avoids the need for new
algorithms while computing preferred answer sets of an ordered logic program.
plp is based on an approach for expressing preference information within the
framework of standard answer set semantics [6], and is conceived as a front-end
Table 1. The syntax of plp input files

Meaning                        Symbols                               Internal
⊥, ⊤                           false/0, true/0
¬                              neg/1, -/1 (prefix)                   neg_L, L ∈ ℒ
not                            not/1, ~/1 (prefix)
∧                              ,/1 (infix; in body)
∨                              ;/1, v/2, |/2 (infix; in head)
←                              :-/1 (infix; in rule)
≺                              </2 (infix)                           prec/2
n_r : ⟨head(r)⟩ ← ⟨body(r)⟩    ⟨head(r)⟩ :- name(n_r), ⟨body(r)⟩
                               ⟨head(r)⟩ :- [n_r], ⟨body(r)⟩
ok, rdy                        ok/1, rdy/2
ap, bl                         ap/1, bl/1

to the logic programming systems dlv [5] and smodels [8]. The general technique
is described in [4] and derives from a methodology for addressing preferences
in default logic first proposed in [2].

We begin with an ordered logic program, which is an extended logic program
in which rules are named by unique terms and in which preferences among
rules are given by a new set of atoms of the form $s \prec t$, where s and t are
names. Such an ordered logic program is then transformed into a second,
regular, extended logic program wherein the preferences are respected, in the
sense that the answer sets obtained in the transformed theory correspond to the
preferred answer sets of the original theory. The transformation is realized by
adding sufficient control elements to the rules of the given ordered logic
program which guarantee that successive rule applications are in accord with
the intended order. More specifically, the transformed program contains control
atoms ap(·) and bl(·), which detect when a rule has been applied or blocked,
respectively, as well as auxiliary atoms ok(·) and rdy(·,·), which control the
applicability of rules based on antecedent conditions reflecting the given
order information.

The approach is sufficiently general to allow the specification of
preferences among preferences, preferences holding in a particular context, and
preferences holding by default. Moreover, the approach permits a generic
compilation methodology, making it possible to express differing preference
strategies. Basically, this is achieved by varying the specific antecedent
conditions for the control atoms ok(·) and rdy(·,·). Currently, plp treats three
kinds of preference strategies, viz. those proposed by Brewka and Eiter [1],
Delgrande, Schaub, and Tompits [2,4], and Wang, Zhou, and Lin [10].

2  Applying  the  System 

The syntax of plp is summarised in Table 1. An example file comprising an
ordered logic program is the following:

    neg a.
    b :- name(n2), neg a, not c.
    c :- name(n3), not b.
    (n3 < n2) :- not d.

Here, name(n2) and name(n3) serve as names for the rules in which these terms
occur, and the last rule expresses that the rule named n2 is preferred over the
rule named n3, in case atom d cannot be inferred.

Fig. 1. Compilation with plp: external view

Once this file, say example.lp, is read into plp, it is subject to multiple
transformations. Most of these transformations are rule-centered in the sense
that they apply in turn to each single rule. The first phase of the compilation
is system-independent and corresponds to the transformations given in [4].
While the original file is supposed to have the extension lp, the result of the
system-independent compilation phase is kept in an intermediate file with
extension pl (e.g., example.pl).

While this compilation phase can be engaged explicitly by the command
lp2pl/1, one is usually interested in producing system-specific code that is
directly usable by either dlv or smodels. This can be done by means of the
commands lp2dlv/1 and lp2sm/1,^ which then produce system-specific code
resulting in files having extensions dlv and sm, respectively. These files can
then be fed into the respective system by a standard command interpreter, such
as a UNIX shell, or from within the Prolog system through commands dlv/1 or
smodels/1. For example, after compiling our example by lp2dlv, we may proceed
as follows:

    | ?- dlv('Examples/example').
    Calling :dlv Examples/example.dlv
    dlv [build BEN/Jun 11 2001  gcc 2.95.2 19991024 (release)]

    {true, name(n2), name(n3), neg_a, ok(n2), rdy(n2,n2),
     rdy(n2,n3), rdy(n3,n3), prec(n3,n2), neg_prec(n2,n3),
     ap(n2), b, rdy(n3,n2), ok(n3), bl(n3)}

^ These files are themselves obtainable from the intermediate pl-files via
commands pl2dlv/1 and pl2sm/1, respectively.

Both commands can be furnished with the option nice (as an additional
argument) in order to strip off the auxiliary predicates:

    | ?- dlv('Examples/example',nice).
    Calling :dlv -filter=a [...] -filter=neg_d Examples/example.dlv
    dlv [build BEN/Jun 11 2001  gcc 2.95.2 19991024 (release)]

    {neg_a, b}

The above series of commands can be issued as a single one by means
of lp4dlv/1 and lp4sm/1, respectively. Moreover, to change the underlying
preference strategy, a simple patch is executed, which redefines certain
predicates. The overall (external) behaviour of plp is illustrated in Figure 1.
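The effect of the nice option can be mimicked outside the system by post-filtering an answer set. The Python sketch below is only an illustration of that effect, not plp's mechanism (plp passes -filter options to dlv, as the transcript shows). The set `AUXILIARY` is an assumption read off the sample output above, and both function names are hypothetical.

```python
from typing import List

# Predicates treated as auxiliary; an assumption based on the sample output.
AUXILIARY = {"true", "name", "ok", "rdy", "ap", "bl", "prec", "neg_prec"}


def split_atoms(answer_set: str) -> List[str]:
    """Split '{p, q(a,b), ...}' into atoms, ignoring commas inside arguments."""
    body = answer_set.strip().strip("{}")
    atoms, depth, current = [], 0, ""
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            atoms.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        atoms.append(current.strip())
    return atoms


def strip_auxiliary(answer_set: str) -> List[str]:
    """Keep only atoms whose predicate symbol is not auxiliary."""
    return [a for a in split_atoms(answer_set)
            if a.split("(")[0] not in AUXILIARY]


FULL = """{true, name(n2), name(n3), neg_a, ok(n2), rdy(n2,n2),
 rdy(n2,n3), rdy(n3,n3), prec(n3,n2), neg_prec(n2,n3),
 ap(n2), b, rdy(n3,n2), ok(n3), bl(n3)}"""
```

Applied to the full answer set of the example, `strip_auxiliary(FULL)` leaves the two atoms neg_a and b, matching the output obtained with nice.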

For treating variables, some additional preprocessing is necessary for
instantiating the rules before they are compiled. The presence of variables is
indicated by the file extension vlp. The content of such a file is first
instantiated by systematically replacing variables by constants and then freed
from function symbols by replacing terms by constants, e.g., f(a) is replaced
by f_a. This is clearly a rather pragmatic approach. A more elaborate
compilation would be obtained by proceeding right from the start in a
system-specific way.
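The replacement of ground terms by constants can be sketched as follows. Only the case f(a) to f_a is given in the text; the treatment of nested terms and of several arguments (joining all parts with underscores) is an assumption made for this Python illustration, and the function names are hypothetical.

```python
from typing import List


def split_args(args: str) -> List[str]:
    """Split an argument string at top-level commas."""
    parts, depth, current = [], 0, ""
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return parts


def flatten_term(term: str) -> str:
    """Turn a ground term such as f(a) into the constant f_a."""
    term = term.strip()
    if "(" not in term:
        return term
    functor, rest = term.split("(", 1)
    args = split_args(rest[:-1])          # drop the closing parenthesis
    return "_".join([functor] + [flatten_term(a) for a in args])


def flatten_atom(atom: str) -> str:
    """Flatten the arguments of a ground atom, keeping its predicate symbol."""
    atom = atom.strip()
    if "(" not in atom:
        return atom
    pred, rest = atom.split("(", 1)
    args = split_args(rest[:-1])
    return pred + "(" + ",".join(flatten_term(a) for a in args) + ")"
```

Note that such a naming scheme can clash with constants already present in the program (for instance an existing constant f_a), which is one reason the approach is pragmatic rather than general.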

Finally, the current prototype is available at
http://www.cs.uni-potsdam.de/~torsten/plp/.
1 Introduction

A recent paper [1] laid out the theoretical basis for effective reasoning with
infinite stable models and normal programs with function symbols. For the class
of finitary programs introduced there, ground queries are decidable and
nonground queries are semi-decidable under both credulous and skeptical stable
model semantics. Finitary programs are expressive enough to simulate any given
Turing machine. In order to exploit the potential expressiveness of finitary
programs, a family of tools is needed, including:

1.  Tools  for  automatic  recognition  of  finitary  programs.  This  task  can  only 
be  approximated,  as  the  class  of  finitary  programs  is  not  decidable.  For  this 
purpose,  we  use  techniques  related  to  abstract  interpretations  and  automated 
program  analysis. 

2. Front ends that, given a finitary program P and a query Q, construct the
finite fragment of Ground(P) needed to answer Q. The fragment (whose
existence is proved in [1]) is meant to be fed to credulous engines such as
SMODELS [4].

In this note we introduce prototypes of the above tools implemented in XSB
(http://xsb.sourceforge.net), and illustrate their relationships with SMODELS
and the resolution-based skeptical reasoner described in [2]. The prototypes
are meant to demonstrate the feasibility of these techniques as a preliminary
step toward more advanced implementations.

2  Theoretical  Preliminaries 

The dependency graph of a program P is a labelled directed graph whose vertices
are the ground atoms of P's language. Moreover, i) there exists an edge from B
to A iff there is a rule $r \in Ground(P)$ with A in the head and an occurrence
of B in the body; ii) such an edge is labelled "negative" if B occurs in the
scope of negation, and "positive" otherwise. An atom A depends positively
(resp. negatively) on B if there is a directed path from B to A in the
dependency graph with an even (resp. odd) number of negative edges. By
odd-cycle we mean a cycle in the dependency graph with an odd number of
negative edges.

We  say  a  program  P  is  finitary  if  the  following  conditions  hold: 
Condition 1 For each node A of the dependency graph of P, the set of all
nodes B such that A depends (either positively or negatively) on B is finite.

Condition 2 Only a finite number of nodes of the dependency graph of P
occur in an odd-cycle.

The relevant universe for a ground formula F (w.r.t. program P), denoted by
U(P, F), is the set of all ground atoms A such that the dependency graph of P
contains a path from A to an atom occurring either in F or in some odd-cycle
of the graph.

The relevant subprogram for a ground formula F (w.r.t. program P), denoted
by R(P, F), is the set of all rules in Ground(P) whose head belongs to U(P, F).
The important properties of relevant subprograms are the following. If P is
finitary, then for all ground goals G:

1. R(P, G) is finite;

2. P credulously entails G iff R(P, G) does;

3. P skeptically entails G iff R(P, G) does.
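For a finite ground program the definitions above can be turned directly into code. The following Python sketch is an illustration only and is unrelated to the XSB prototypes; all names are hypothetical. Two assumptions are made: cycles are treated as closed walks, so an atom counts as odd-cyclic when it depends negatively on itself, and the atoms of the goal themselves are included in U(P, F) via paths of length zero.

```python
from collections import deque
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    head: str
    pos: FrozenSet[str]  # atoms occurring positively in the body
    neg: FrozenSet[str]  # atoms occurring under negation in the body


def rule(head: str, pos: Iterable[str] = (), neg: Iterable[str] = ()) -> Rule:
    return Rule(head, frozenset(pos), frozenset(neg))


def body_edges(program: List[Rule]) -> Dict[str, Set[Tuple[str, bool]]]:
    """Map each head A to the pairs (B, is_negative) of all edges B -> A."""
    edges: Dict[str, Set[Tuple[str, bool]]] = {}
    for r in program:
        sources = edges.setdefault(r.head, set())
        sources.update((b, False) for b in r.pos)
        sources.update((b, True) for b in r.neg)
    return edges


def odd_cycle_atoms(program: List[Rule]) -> Set[str]:
    """Atoms that depend negatively on themselves."""
    edges = body_edges(program)
    result: Set[str] = set()
    for start in edges:
        seen = {(start, 0)}
        queue = deque(seen)
        while queue:                      # search over (atom, parity) states
            atom, parity = queue.popleft()
            for b, negative in edges.get(atom, ()):
                state = (b, parity ^ int(negative))
                if state == (start, 1):
                    result.add(start)
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return result


def relevant_universe(program: List[Rule], goal_atoms: Iterable[str]) -> Set[str]:
    """U(P, F): atoms with a path to a goal atom or to an odd-cyclic atom."""
    edges = body_edges(program)
    universe = set(goal_atoms) | odd_cycle_atoms(program)
    queue = deque(universe)
    while queue:
        atom = queue.popleft()
        for b, _ in edges.get(atom, ()):
            if b not in universe:
                universe.add(b)
                queue.append(b)
    return universe


def relevant_subprogram(program: List[Rule], goal_atoms: Iterable[str]) -> List[Rule]:
    """R(P, F): the rules whose head belongs to U(P, F)."""
    universe = relevant_universe(program, goal_atoms)
    return [r for r in program if r.head in universe]


# Illustrative program: p <- q.  q <- not r.  r.  s <- not s.  t <- u.  u.
PROGRAM = [rule("p", ["q"]), rule("q", [], ["r"]), rule("r"),
           rule("s", [], ["s"]), rule("t", ["u"]), rule("u")]
```

For the goal p, the relevant subprogram of `PROGRAM` keeps the rules for p, q, r, and the odd-cyclic atom s, and drops the rules for t and u. In the setting of the paper Ground(P) is in general infinite, so this sketch only conveys the definition, not the construction used by the front end.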

3  Recognizing  Finitary  Programs 

Establishing whether a given program is finitary is an undecidable problem.
We approximate this decision problem using program analysis techniques. In
particular, Condition 1 is checked by analyzing the recursion patterns of the
input program, looking for arguments whose norm (a measure of term size) does
not increase indefinitely. The finitary program recognizer consists of four
stages:

1. Interargument analysis. During this phase, the mutual relationships
between the sizes of each predicate's arguments are evaluated. For example,
for each call append(A,B,C) the analysis would discover that |A| ≤ |C| and
|B| ≤ |C|. Interargument information yields bounds on the size of local
variables; this is essential for proving Condition 1.

2.  Recursion  analysis.  At  this  stage,  the  recognizer  looks  for  cyclic  atom 
dependencies,  then  refines  them  by  identifying  suitable  recursion  patterns, 
i.e.,  sets  of  predicate  arguments  that  either  strictly  decrease  or  almost  never 
get  larger  at  each  recursion.  After  this  analysis  each  predicate  is  labelled  as 
acyclic  or  potentially  cyclic.  The  underlying  data  structure  is  a  graph  whose 
size  is  linear  in  the  input  (nonground)  program.  Among  other  predicates,  this 
recursion  analysis  verifies  Condition  1  for  all  the  main  standard  predicates 
on  lists,  including  list,  member,  append,  reverse,  and  merge. 

3. Recursive domain predicate identification. During this phase, the
analyzer identifies acyclic predicates that can be evaluated at program
instantiation time to instantiate all local variables in finitely many ways
(thereby avoiding infinite branching in the dependency graph). These
predicates play the same role as Lparse's domain predicates [5]. One important
difference is that recursive domain predicates can be locally stratified, while
Lparse's domain predicates can only be stratified.
4. Cycle analysis. Condition 2 is checked at this stage. The recognizer looks
for odd-cycles (i.e. cycles through an odd number of negations). Cycle
identification takes into account recursion information (derived during the
second stage) so that the analysis is sharper than a simple inspection of the
predicate dependency graph. For instance, the following program would be
accepted and recognized as acyclic:

    even(0).
    even(s(X)) :- not even(X).

If a (potential) odd cycle is identified, then it is required to be ground. A
sharper analysis, to be included in future versions, is discussed in the last
section.
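The even example shows why the predicate dependency graph alone is too coarse. The Python sketch below, which is an illustration and not the recognizer's algorithm, contrasts the predicate-level graph (a negative self-loop on even) with a bounded fragment of the ground dependency graph, whose edges always lead from a term to a strictly larger one. The names `numeral`, `norm`, `ground_edges`, and `has_cycle` are hypothetical.

```python
from typing import Dict, List, Tuple


def numeral(k: int) -> str:
    """The term s(s(...s(0)...)) with k occurrences of s."""
    return "s(" * k + "0" + ")" * k


def norm(term: str) -> int:
    """Term-size norm: the number of s symbols in a numeral."""
    return term.count("s(")


def ground_edges(depth: int) -> List[Tuple[str, str]]:
    """Negative edges even(t) -> even(s(t)) for numerals t of size < depth,
    written as pairs of argument terms."""
    return [(numeral(k), numeral(k + 1)) for k in range(depth)]


def has_cycle(edges: List[Tuple[str, str]]) -> bool:
    """Detect a directed cycle by repeatedly removing nodes of in-degree 0."""
    nodes = {n for edge in edges for n in edge}
    succ: Dict[str, List[str]] = {n: [] for n in nodes}
    indeg: Dict[str, int] = {n: 0 for n in nodes}
    for a, b in edges:
        succ[a].append(b)
        indeg[b] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return removed < len(nodes)


# Predicate level: even depends negatively on even, a potential odd cycle.
PREDICATE_EDGES = [("even", "even")]
```

At the predicate level `has_cycle(PREDICATE_EDGES)` holds, whereas every finite fragment produced by `ground_edges` is acyclic because the norm strictly increases along each edge.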


4  Credulous  Reasoning 

To use the existing credulous reasoning engines, a suitable front-end is needed
whose function is to compute the (ground) relevant subprogram R(P, G). The
relevant subprogram can be fed to the existing engines, including SMODELS,
to answer the given goal G. Part of the construction of R(P, G) is common to
all G and can be factored out. In particular, the part of the program on which
the odd-cycles depend is always contained in R(P, G) and can be pre-computed
once and for all. This part of the computation needs the results of the
recognizer's analysis to identify the odd-cycles. Recursive domain predicates
are evaluated at instantiation time. Currently, instantiation proceeds
top-down, starting from the input goal G. Since P is finitary, the procedure is
guaranteed to terminate.

5  Skeptical  Reasoning 

A prototype skeptical reasoner based on the skeptical resolution calculus [2]
has been implemented in XSB. The prototype is a semi-naive meta-interpreter.
A brief description can be found in the journal version of [2]. The skeptical
resolution calculus is sound for all programs [2] and complete for all finitary
programs [1]. In other words, the prototype can be used to find the (nonground,
existentially quantified) skeptical consequences of finitary programs with no
modification to the existing code (although it had been designed for
function-free programs). The relevant subprogram has only a theoretical role,
in proving completeness. The instantiation tool is of no use for skeptical
reasoning, as resolution instantiates program rules as needed, without
necessarily grounding them. The program recognizer is still needed to accept
admissible programs. As a by-product, the recognizer (approximately) identifies
cyclic and odd-cyclic atoms; such information is needed by, and can be fed to,
the restricted split strategy implemented by the meta-interpreter (see [2] for
a definition). In the first version of the meta-interpreter, cycle information
was calculated with much less precision, affecting the effectiveness of the
strategy.
6 Future Enhancements

Currently the recognizer computes only relative estimates of term size. For a
sharper analysis, we are planning to replace the interargument analysis
predicates with a module based on abstract interpretation, currently being
developed at the University of Parma. The abstract domain consists of polyhedra
and describes interargument size relations as linear equations. In order to use
such information, the recognizer should be extended with symbolic calculation
capabilities. The term norms used in the current prototype are not the only
possible norms. Existing work on automatic norm selection will be considered in
future implementations. A second enhancement consists in recognizing
ω-stratified [5] subprograms, which are guaranteed to be finitary.
ω-stratification poses more restrictions on domain predicates and clause
variables, but then all cycles are guaranteed to be finite, including some that
otherwise would not be accepted by the fourth stage of the recognizer.
Abstract. The characterization of stable models using the monotonic
logic of pertinence helps to identify program transformations leading to
a new normal form of programs. This provides an alternative view on
automated reasoning for stable models from which improvements on
existing systems, e.g. smodels, can be identified.


1  Introduction 

In [2] a logic program is characterized as a pertinence logic theory, such that
for each rule of the program there is a formula in pertinence logic of the form

$$A_0 \leftarrow A_1, \ldots, A_m, \mathit{not}\ A_{m+1}, \ldots, \mathit{not}\ A_n \qquad (1)$$

where $n \geq m \geq 0$, and each $A_i$ is an atom^.

The stable models of the program correspond to the p-stable causal models
of the pertinence theory. Causal models are minimal models defining a
nonmonotonic pertinence logic; and p-stable models are those models that verify
a structural condition.

Pertinence  logic  is  a  monotonic  logic,  thus  a  semantic  characterization  of 
strongly  equivalent  programs  is  presented.  Two  programs  are  strongly  equivalent 
if  the  (monotonic)  models  of  the  corresponding  pertinence  theories  are  the  same. 

These  results  can  be  used  in  automated  reasoning  for  stable  logic  programs. 
In  the  next  section  we  identify  program  transformations  leading  to  a  new  normal 
form  where  automated  inference  can  be  applied.  This  constitutes  an  alternative 
view  on  inference  for  stable  models  that  includes,  e.g.  program  reduction  based 
on  strong  equivalence  (section  3),  and  computation  of  stable  models  (section  4). 

2  Normal  Form 

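For every formula of the form (1) in pertinence logic, the following two
constraint formulas are equivalent to it.

$$\leftarrow \mathit{not}\ A_0, A_1, \ldots, A_m, \mathit{not}\ A_{m+1}, \ldots, \mathit{not}\ A_n \qquad (2)$$

$$\leftarrow \mathit{nop}\ A_0, A_1, \ldots, A_m, \mathit{not}\ A_{m+1}, \ldots, \mathit{not}\ A_n \qquad (3)$$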

^ In that work the operator for negation in truth and in pertinence was
denoted notp. To clarify this description, here we denote it by the usual LP
operator not, to which it corresponds.

T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 420-423, 2001.
The head $A_0$ is 'moved' to the body with negation in truth and in pertinence
(corresponding to negation as failure) in rule (2), and with negation in
pertinence (a new negation not defined in LP) in rule (3).

To get an intuitive idea of this transformation we recall from [2] that
pertinence logic gives to each atom two truth values simultaneously, one from
the set {T,F} corresponding to true and false, and another from the set {P,N}
corresponding to pertinent and nonpertinent.

Atoms TP (true and pertinent) will correspond to true atoms in the 2-valued
semantics of stable models; and atoms FN (false and nonpertinent) will
correspond to false atoms. The remaining valuations, TN and FP, do not
have a correspondence in the 2-valued semantics. For instance, the structural
condition that defines p-stable models among all the pertinence models is that
all the atoms in the model have valuation TP or FN.

The first constraint (2) can be read as: "there is a contradiction if the body
is true and pertinent and the head is false and nonpertinent"; and the second
constraint (3) as: "there is a contradiction if the body is true and pertinent
and the head is true and nonpertinent". (In fact, we would need a third
constraint for the FP value in order to prove equivalence, but see [2]; in any
case the two constraints are implied by the rule and, as far as stable models
are concerned, this will be enough for this work.)

In summary, the introduction of the nop operator in the syntax provides a
way to represent any normal logic program as a set of constraint rules.


Example 1. Consider the following program and the corresponding constraint
form:

    q ← p                   ← not q, p
                            ← nop q, p
    q ← not p               ← not q, not p
                            ← nop q, not p
    p ← not p, not q        ← not p, not q
                            ← nop p, not p, not q

Note that the constraint ← not p, not q is included twice, and the constraint
← nop p, not p, not q is a tautology and can be deleted from the program. □


3  Program  Transformations 

Let  us  define  three  transformations  in  constraint  form. 

• Tautology Delete a constraint if an atom appears in two different literals.
Note that in constraint form we have three types of literals, e.g. p, not p,
and nop p.

• Subsumption Delete a constraint if it is a superset of another constraint in
the program. Here constraints are regarded as the sets of their literals.

• Literal Reduction If there are three constraints that differ in only one
literal, and this literal appears in the three possible forms for the same
atom, then replace the three constraints by the constraint corresponding to
their intersection.

These three transformations preserve the semantics of the program, since the
deleted constraints are (monotonically) entailed by the remaining ones.

In Example 1 the two constraints of rule p ← not p, not q will be deleted (the
rule is entailed by q ← not p), thus this rule can be deleted from the program.

When the program in Example 1 is run in the smodels system [1], the three rules
are not simplified by lparse. It is worth noting that the program
{q ← p, q ← not p} (the program in the example is strongly equivalent to it)
is simplified by lparse to {q ←}.
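The translation to constraint form and the three transformations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation evaluated in the paper: constraints are represented as sets of (form, atom) pairs, so duplicate constraints merge automatically, and all names (`to_constraints`, `is_tautology`, `literal_reduction`, `simplify`) are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Literal = Tuple[str, str]          # (form, atom); form is "pos", "not" or "nop"
Constraint = FrozenSet[Literal]
FORMS = ("pos", "not", "nop")


def to_constraints(head: str, pos: Iterable[str] = (),
                   neg: Iterable[str] = ()) -> List[Constraint]:
    """Constraints (2) and (3) for the rule head <- pos, not neg."""
    body = {("pos", a) for a in pos} | {("not", a) for a in neg}
    return [frozenset(body | {("not", head)}),
            frozenset(body | {("nop", head)})]


def is_tautology(c: Constraint) -> bool:
    """An atom appears in two different literals of the constraint."""
    atoms = [atom for _, atom in c]
    return len(atoms) != len(set(atoms))


def literal_reduction(cs: Set[Constraint]) -> Set[Constraint]:
    """Apply one literal reduction step if some atom occurs in all three
    forms in otherwise identical constraints."""
    for c in cs:
        for form, atom in c:
            rest = c - {(form, atom)}
            variants = {rest | {(f, atom)} for f in FORMS}
            if variants <= cs:
                return (cs - variants) | {rest}
    return cs


def simplify(constraints: Iterable[Constraint]) -> Set[Constraint]:
    """Apply tautology, subsumption and literal reduction up to a fixpoint."""
    cs = {c for c in constraints if not is_tautology(c)}
    while True:
        reduced = literal_reduction(cs)
        reduced = {c for c in reduced if not any(d < c for d in reduced)}
        if reduced == cs:
            return cs
        cs = reduced


# Example 1: q <- p.  q <- not p.  p <- not p, not q.
EXAMPLE_1 = (to_constraints("q", ["p"]) + to_constraints("q", [], ["p"])
             + to_constraints("p", [], ["p", "q"]))
```

On Example 1, `simplify(EXAMPLE_1)` returns exactly the four constraints of the rules q ← p and q ← not p: one constraint of the third rule is a tautology and the other coincides with a constraint of the second rule.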

We compared several strongly equivalent programs obtained by simple
application of subsumption, i.e. the program is translated to constraint form,
subsumption is applied, and the result is translated back to LP form. The
speed-up seems to follow the ratio between the number of rules in the original
program and the number of rules that remain.

An extended lparse (or alternatively an additional simplifier run between
lparse and smodels) performing these reductions makes the smodels system more
efficient.


4  Computing  the  Stable  Models 

In constraint form the program represents more directly the interpretations
that are not models of the rules. Developing this idea, we provide an
alternative view on automated reasoning for stable models.

Let us represent the collection of all the possible models with a common
subset by (P, N), where P is the set of atoms present in all of them (the
common positive atoms) and N is the set of atoms absent from all of them (the
common negative atoms). Then for a constraint of the form (2), the collection
that includes all its non-models is

$$(\{A_1, \ldots, A_m\}, \{A_0, A_{m+1}, \ldots, A_n\}).$$

None of these is a monotonic model of the rule, thus none of them is a stable
model of the program. We do not need to test these models for stability.

Consider now the constraints of the form (3), which we will call
nop-constraints. Such a constraint also has an associated collection of
non-models, but the constraint of the form (2) from the same rule (the
not-constraint) already includes all of these as non-models. The
nop-constraints have one nop $A_0$ literal. Thus they delete some
interpretations that do not correspond to 2-valued interpretations.

Recalling the monotonic characterization in [2], stable models of a program
are minimal models (causal models) of the pertinence theory. A model is
minimal iff there is no other model for the same truth with a subset of
pertinent atoms. Intuitively, a (p-stable) model is stable iff none of the
models that minimize it is a model of the theory. Let us call minimizers the
models that minimize a p-stable model.
The nop-constraints have minimizers as counter-models. Every stable model
(unless {}) has at least one minimizer. Thus every stable model needs at least
one nop-constraint. Furthermore, we can identify the collection of (possible)
stable models whose minimizers a nop-constraint deletes, thus contributing to
their stability:

$$(\{A_0, A_1, \ldots, A_m\}, \{A_{m+1}, \ldots, A_n\}).$$

Then from the nop-constraints we get the collections of all possible stable
models of the program.

Example  2.  Consider  the  following  program  and  the  corresponding  collections. 


a  <—  c,  not  b 
6  c,  not  a 
c  not  a 


not-collection                      nop-collection
not a, not b, c   ({c}, {a, b})
nop a, not b, c

not b, not a, c   ({c}, {a, b})
nop b, not a, c   ({c,

not c, not a      ({}, {a, c})
nop c, not a


We only need to search the collections that are associated with the
nop-constraints and that do not appear among the collections of the
not-constraints.
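
As a cross-check on Example 2, the stable models of the program can be enumerated by brute force. The sketch below does not implement the collection-based search described above; it applies the standard Gelfond-Lifschitz reduct test to every subset of atoms, and it assumes the program reads a ← c, not b; b ← c, not a; c ← not a. The rule encoding and all function names are illustrative choices.

```python
from itertools import chain, combinations

# Each rule is (head, positive_body, negative_body).
PROGRAM = [
    ("a", {"c"}, {"b"}),   # a <- c, not b
    ("b", {"c"}, {"a"}),   # b <- c, not a
    ("c", set(), {"a"}),   # c <- not a
]
ATOMS = {"a", "b", "c"}


def least_model(positive_rules):
    """Least model of a negation-free program, by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def is_stable(candidate, program):
    """Gelfond-Lifschitz test: candidate equals the least model of its reduct."""
    reduct = [(h, pos) for h, pos, neg in program if not (neg & candidate)]
    return least_model(reduct) == candidate


def stable_models(program, atoms):
    """Test every subset of atoms (exponential; for small examples only)."""
    ordered = sorted(atoms)
    subsets = chain.from_iterable(
        combinations(ordered, k) for k in range(len(ordered) + 1))
    return [set(s) for s in subsets if is_stable(set(s), program)]


print(stable_models(PROGRAM, ATOMS))
```

Under this reading of the program, the only candidate that passes the test is {b, c}.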


In the characterization on which the smodels system is based, the search is
performed on the set of the negative antecedents of the rules. Note that
there is no atom in the N component of the nop-collections that is not a negative
antecedent of a rule. But the nop-collections contain additional information on
the positive atoms associated with a particular subset of the negative antecedents.
Furthermore, some of the combinations of negative antecedents could belong to
a not-collection, so there is no need to try them. This information is introduced
into the smodels system and compared with its strategy.

1  General Information

DLV is an efficient Answer Set Programming (ASP) system implementing the
consistent answer set semantics [5] with various language enhancements like
support for logic programming with inheritance and queries, integer arithmetic,
and various other built-in predicates.

DLV is being developed using GNU tools (GCC, flex, and bison) and is
therefore portable to most Unix-like platforms as well as Microsoft Windows.
For up-to-date information on the system and a full manual please refer to the
URL http://www.dbai.tuwien.ac.at/proj/dlv/, where you can also download
binaries and various examples.

2  DLV  Language 

The  kernel  language  of  DLV  is  disjunctive  datalog  extended  with  strong  negation 
under  the  consistent  answer  set  semantics  [5]. 

Let $a_1, \ldots, a_n, b_1, \ldots, b_m$ be classical literals (atoms possibly
preceded by the classical negation symbol ¬) and $n \geq 0$, $m \geq k \geq 0$.
A (disjunctive) rule r is a formula

    $a_1 \ \mathtt{v} \ \cdots \ \mathtt{v} \ a_n \ \mathtt{:-} \ b_1, \ldots, b_k, \mathtt{not}\ b_{k+1}, \ldots, \mathtt{not}\ b_m.$

A strong constraint is a rule with empty head (n = 0). A weak constraint is

    $\mathtt{:\sim} \ b_1, \ldots, b_k, \mathtt{not}\ b_{k+1}, \ldots, \mathtt{not}\ b_m. \ [\mathit{Weight} : \mathit{Level}]$

where both Weight and Level are positive integers. A disjunctive datalog program
P is a finite set of rules and constraints.

*  This  work  was  supported  by  FWF  (Austrian  Science  Funds)  under  the  projects 
Z29-INF and P14781 and MURST under project COFIN-2000 “From Data to
Information (D2I)”.

The semantics of these programs is provided in [1] as an extension of the
classical answer set semantics given in [5].

In addition to its kernel language, DLV provides a number of application
frontends that show the suitability of our formalism for solving various problems
from the areas of Artificial Intelligence, Knowledge Representation and
(Deductive) Databases. In particular, the following frontends are currently
available: Brave and Cautious Reasoning Frontend, Diagnosis Frontend, SQL3
Frontend, Inheritance Frontend, and Planning Frontend.

3  Representing Problems in DLV

The core language of DLV can be used to encode problems in a highly declarative
fashion. We will next show a number of sample DLV encodings. We will see
that several problems, including problems of high computational complexity, can
be solved naturally in DLV by using a declarative style of programming.

3-Colorability (3COL) Given a graph, represented by facts of the form
node(_) and edge(_,_), assign each node one of three colors such that no two
adjacent nodes have the same color. 3-Colorability is a classical NP-complete
problem. In DLV, the problem can be encoded in a very easy and natural way
by means of disjunction and constraints:

    col(X,red) v col(X,green) v col(X,blue) :- node(X).
    :- edge(X,Y), col(X,C), col(Y,C).

The disjunctive rule nondeterministically chooses a color for each node X in
the graph; the constraint enforces that the choices are legal.
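
The guess-and-check reading of this encoding can be mimicked outside DLV. The following Python sketch is only an illustration of the idea, not of how DLV evaluates the program: it enumerates every color assignment (the "guess" of the disjunctive rule) and discards those with a monochromatic edge (the "check" of the constraint). The sample graph and the function name are hypothetical.

```python
from itertools import product

COLORS = ("red", "green", "blue")


def three_colorings(nodes, edges):
    """Yield every assignment node -> color with no monochromatic edge."""
    for choice in product(COLORS, repeat=len(nodes)):      # guess
        col = dict(zip(nodes, choice))
        if all(col[x] != col[y] for x, y in edges):        # check
            yield col


# Hypothetical input: a triangle a-b-c with a pendant node d attached to c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
solutions = list(three_colorings(nodes, edges))
print(len(solutions), solutions[0])
```

Each surviving assignment plays the role of one answer set of the DLV program for the same graph.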

Hamiltonian  Path  (HAMPATH)  is  another  classical  NP-complete  problem 
from  the  area  of  graph  theory: 

Given an undirected graph G = (V, E), where V is the set of vertices of G
and E is the set of edges, and a node a ∈ V of this graph, does there exist a
path of G starting at a and passing through each node in V exactly once?

Suppose that the graph G is specified by using two predicates node(X) and
arc(X,Y), and the starting node is specified by the predicate start(X) which
contains only a single tuple. Then, the following program solves the problem
HAMPATH.

    inPath(X,Y) v outPath(X,Y) :- reached(X), arc(X,Y).
    :- node(X), not reached(X).
    reached(X) :- start(X).
    reached(X) :- inPath(Y,X).
    :- inPath(X,Y), inPath(X,Y1), Y <> Y1.
    :- inPath(X,Y), inPath(X1,Y), X <> X1.
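
For comparison, the HAMPATH problem statement itself can be checked by exhaustive search. The sketch below is a plain Python brute force over node orderings, not a translation of the DLV rules; the sample graph and the function name are assumptions made for illustration.

```python
from itertools import permutations


def hamiltonian_paths(nodes, arcs, start):
    """Yield node sequences that begin at start and visit each node once."""
    arc_set = set(arcs)
    others = [n for n in nodes if n != start]
    for order in permutations(others):
        path = (start,) + order
        # Every consecutive pair on the path must be connected by an arc.
        if all((x, y) in arc_set for x, y in zip(path, path[1:])):
            yield path


# Hypothetical input; undirected edges are given as arcs in both directions.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
arcs = edges + [(y, x) for x, y in edges]
print(list(hamiltonian_paths(nodes, arcs, "a")))
```

The enumeration is factorial in the number of nodes, so it is only meant for checking small instances by hand.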

Timetabling The problem consists of assigning course exams to time slots
in such a way that no two exams are assigned the same time slot if they are
“incompatible”, i.e., the respective courses have a student in common. Assuming
that there are three time slots available, namely, s1, s2 and s3, we express the
problem as follows:

    assign(X,s1) v assign(X,s2) v assign(X,s3) :- course(X).
    :- assign(X,S), assign(Y,S), incompatible(X,Y).

Clearly, it may happen that there is no way to assign courses to time slots
without having some overlapping between incompatible courses. Then, an
approximate solution where constraints are satisfied as much as possible is
desirable. In this light, the problem at hand can be restated as follows: assign
courses to time slots trying to minimize the overlapping of incompatible courses.
To solve this problem we resort to the notion of weak constraints, as shown below:

    assign(X,s1) v assign(X,s2) v assign(X,s3) :- course(X).
    :~ assign(X,S), assign(Y,S), incompatible(X,Y).

Intuitively, the weak constraint above states: “Preferably, do not assign the
courses X and Y to the same slot S if they are incompatible”.
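
The effect of the weak constraint can be sketched as an explicit optimization: among all slot assignments, keep those that violate the constraint as few times as possible. The Python sketch below assumes that every violated instance costs one unit and that incompatible pairs are listed once (unordered); DLV's actual weight and level handling is not modelled. The course names are hypothetical.

```python
from itertools import product

SLOTS = ("s1", "s2", "s3")


def best_timetables(courses, incompatible):
    """Return (minimum number of violations, assignments achieving it)."""
    best, winners = None, []
    for choice in product(SLOTS, repeat=len(courses)):
        assign = dict(zip(courses, choice))
        # One unit of penalty per incompatible pair sharing a slot.
        cost = sum(1 for x, y in incompatible if assign[x] == assign[y])
        if best is None or cost < best:
            best, winners = cost, [assign]
        elif cost == best:
            winners.append(assign)
    return best, winners


courses = ["c1", "c2", "c3", "c4"]
# Every pair is incompatible, so three slots cannot separate all four courses.
incompatible = [(x, y) for i, x in enumerate(courses) for y in courses[i + 1:]]
cost, winners = best_timetables(courses, incompatible)
print(cost, len(winners))
```

With the strong constraint this instance would have no solution at all; with the weak constraint the best assignments are those in which exactly one incompatible pair shares a slot.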

Strategic Companies (STRATCOMP), finally, is a $\Sigma_2^P$-complete problem [2]:
A holding owns companies C(1), ..., C(c), each of which produces some goods.
Some of these companies may jointly control another one. This is modelled by
means of the predicates produced_by(P, C1, C2) (product P is produced by
companies C1 and C2) and controlled_by(C, C1, C2, C3) (company C is jointly
controlled by C1, C2 and C3). Now, some companies should be sold, under the
constraint that all goods can still be produced, and that no company is sold
which would still be controlled by the holding afterwards. A company is
strategic if it belongs to a strategic set, which is a minimal set of companies
satisfying these constraints. Checking whether any given company C is strategic
is done by brave reasoning: “Is there any answer set containing C?”

    strategic(C1) v strategic(C2) :- produced_by(P,C1,C2).
    strategic(C) :- controlled_by(C,C1,C2,C3),
                    strategic(C1), strategic(C2), strategic(C3).

We assume that each product is produced by at most two companies and each
company is jointly controlled by at most three companies to allow for an easier
representation.
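
The definition of a strategic set can be made concrete with a small brute-force enumeration. This Python sketch follows the textual definition (a minimal set that keeps all goods producible and is closed under joint control); it is not DLV's evaluation strategy, and the instance, including the convention of repeating a controller to fill the three controller positions, is a made-up example.

```python
from itertools import combinations


def is_feasible(s, produced_by, controlled_by):
    """s keeps every product available and is closed under joint control."""
    produces_all = all(c1 in s or c2 in s for _, c1, c2 in produced_by)
    closed = all(c in s for c, *owners in controlled_by
                 if all(o in s for o in owners))
    return produces_all and closed


def strategic_sets(companies, produced_by, controlled_by):
    """Enumerate by increasing size; keep sets with no feasible proper subset."""
    minimal = []
    for k in range(len(companies) + 1):
        for combo in combinations(sorted(companies), k):
            s = set(combo)
            if is_feasible(s, produced_by, controlled_by) and \
                    not any(m < s for m in minimal):
                minimal.append(s)
    return minimal


# Hypothetical instance: c1 is controlled by c2 alone (controller repeated).
companies = {"c1", "c2", "c3", "c4"}
produced_by = [("p1", "c1", "c2"), ("p2", "c2", "c3")]
controlled_by = [("c1", "c2", "c2", "c2")]
sets_found = strategic_sets(companies, produced_by, controlled_by)
strategic = set().union(*sets_found)      # brave reasoning: in some strategic set
print(sets_found, sorted(strategic))
```

A company is reported as strategic when it occurs in at least one of the minimal sets, which mirrors the brave-reasoning query on the answer sets.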

4  System  Architecture 

An outline of the general architecture of our system is depicted in Fig. 1.

The  heart  of  the  system  is  the  DLV  core.  Wrapped  around  this  basic  block 
are  frontend  preprocessors  and  output  filters  (which  also  do  some  post-processing 
for  frontends).  The  system  takes  input  data  from  the  user  via  the  command  line 
and  from  the  file  system  and/or  database  systems. 

Upon startup, input is possibly translated by a frontend. Together with
relational database tables, provided by an Oracle database, an Objectivity
database, or ASCII text files, the Intelligent Grounding Module efficiently
generates a subset of the grounded input program that has exactly the same answer
sets as the full program, but is much smaller in general.


After that, the Model Generator is started. It generates one answer set
candidate at a time and verifies it using the Model Checker. Upon success,
filtered output is generated for the answer set. This process is iterated until
either no more answer sets exist or an explicitly specified number of answer sets
has been computed.

Not shown in Fig. 1 are various additional data structures, such as dependency
graphs.


Fig.  1.  Overall  architecture  of  DLV 


5  Current  Applications 

Currently, the DLV system is used for educational purposes in courses on
Databases and on AI, both in European and American universities. It is also used by
several  researchers  for  knowledge  representation,  for  verifying  theoretical  work, 
and  for  performance  comparisons,  in  which  DLV  compares  favorably  to  similar 
systems  [4,3].  For  the  development  of  some  deductive  database  applications  DLV 
can  compete  with  database  systems.  Indeed,  DLV  is  being  considered  by  CERN 
for  such  an  application  which  could  not  be  handled  by  other  systems.  One  of 
the  latest  applications  of  DLV,  issued  by  the  Italian  national  statistics  institute 
(ISTAT), concerns the automatic correction of census data.

1  Introduction

DLV^K is a knowledge-based planning system. It is based on the declarative
language K [2], which is similar in spirit to the logic-based language C [5], but
includes some logic-programming features (e.g., default negation and strong
negation). K offers the following distinguishing features:

•  handling of incomplete knowledge: for a fluent f, neither f nor its
opposite may be known in a state.

•  nondeterministic effects: actions may have multiple possible outcomes.

•  optimistic and secure (conformant) planning: construction of a
“credulous” plan or a “sceptical” plan, which works in all cases.

•  parallel actions: more than one action may be executed simultaneously.

A fully operational prototype of DLV^K, built as a frontend on top of the DLV
system [1], is available on the Web at

<URL:http://www.dbai.tuwien.ac.at/proj/dlv/>.


2  The DLV^K System by Example: Blocks World

We assume that the reader is familiar with action languages and the notion of
actions, fluents, goals, and plans; see e.g. [4] for background. To give a flavor of
DLV^K, we refer here to well-known planning problems in the blocks world, which
require turning given configurations of blocks into other ones (see Figure 1).

In DLV^K, problem domains are defined in two parts: (i) a normal
(disjunction-free) stratified logic program representing the static background
knowledge of the domain, and (ii) a K domain description.

The static background knowledge of the blocks world consists of the following
logic program:

*  This work was supported by FWF (Austrian Science Funds) under the projects
Z29-INF and P14781 and MURST under project COFIN-2000.

[Figure: initial and goal configurations of the blocks]

Fig. 1. A blocks world example


    block(a). block(b). block(c). location(table).
    location(B) :- block(B).

Referring to Figure 1, we want to turn the initial configuration of blocks into
the goal state¹ in three steps, where only one block may be moved in each step
(i.e., concurrent moves are not permitted).

The K domain description uses an action move and two fluents on and
occupied. We shall consider different scenarios. In a basic one, we assume that
the knowledge in the initial state is complete (i.e., the locations of all blocks are
known) and correctly specified. We will then show how to deal with incorrect
and incomplete initial state specifications.

Basic  version  of  blocks  world.  The  domain  description  is  in  this  case  as 
follows: 

    fluents:   on(B,L) requires block(B), location(L).
               occupied(B) requires location(B).
    actions:   move(B,L) requires block(B), location(L).
    always:    executable move(B,L) if not occupied(B), not occupied(L), B <> L.
               inertial on(B,L).
               caused occupied(B) if on(B1,B), block(B).
               caused on(B,L) after move(B,L).
               caused -on(B,L1) after move(B,L), on(B,L1), L <> L1.
    initially: on(a,table). on(b,table). on(c,a).
    noConcurrency.
    goal:      on(c,b), on(b,a), on(a,table)? (3)

First, each fluent and action has to be declared using a type declaration,
which specifies the ranges of its arguments. The literals to the right of
requires (block(B) and location(L)) must not involve default negation
“not” and must be defined in the static background knowledge.

The next part of our domain description consists of executability conditions
and causation rules describing the possible states and transitions. Intuitively,
the executable statement for action move(B,L) says that a block B can be
moved onto a location L ≠ B if both B and L are clear (the table is always clear).
In K, multiple executable statements for the same action are allowed. An
executable statement with empty body, executable A., says that the action A
is always executable. Execution of an action A under condition B is forbidden by
nonexecutable A if B. In case of conflicting specifications, nonexecutable
A overrides executable A.

¹  This is an implementation of the well-known Sussman Anomaly, similar to one in [3].

The causation rules for on and -on specify the dynamic effects of a move.
Informally, a causation rule caused f if C1 after A, C2. means that f is
known to be true if C1 holds in the state and actions in A have (not) been
executed, and condition C2 was true in the previous state. It is worth noting
that the totality of the fluent on is not enforced. Both on(X,Y) and -on(X,Y) may
happen to be unknown at a given instant of time. Actually, the rule for -on could
be replaced by “caused -on(B,L1) if on(B,L), L <> L1.”, stating: wherever
a block is, it is not anywhere else. This rule would give us a sharper description
of the state, making fluent on total at every instant of time. However, using the
more general rule would cause a computational overhead (as more inferences are
to be done during the computation) without providing relevant benefits.

The statement inertial on(X,Y). is a shortcut equivalent to the rule

    caused on(X,Y) if not -on(X,Y) after on(X,Y).

and encodes the principle of inertia for positive knowledge about on.

Static rules (i.e., rules with an empty after part) like

    caused occupied(B) if on(B1,B), block(B).

model a static causation. This can be used to model indirect effects of actions.

The initially: section of the domain description consists of facts/constraints
which must be satisfied only in the initial state. Static rules are also
allowed here.

Simultaneous execution of several actions is normally allowed in K. This can
be prohibited by the statement noConcurrency. which enforces the execution
of at most one action at a time.

Finally, the goal: section defines the goal to be reached and the maximum
plan length given as a positive integer.

The execution of the above DLV^K program computes the following result:

    PLAN: move(c,table,0), move(b,a,1), move(c,b,2)

Here, the additional argument in a move atom represents the instant of time
when the action is executed. Thus, according to the above plan, first c is moved
onto the table, then b is moved on top of a, and, finally, c is moved onto b, which
obviously leads to the desired goal.
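
The plan can be replayed by hand with a small simulator. The Python sketch below is not DLV^K: it assumes complete knowledge (every block has exactly one known location, stored in a dictionary) and hard-codes the executability condition and the effect of move from the domain description above. All names are illustrative.

```python
TABLE = "table"


def occupied(state, loc):
    """A block is occupied when something sits on it; the table never is."""
    return loc != TABLE and loc in state.values()


def apply_move(state, block, loc):
    """Return the successor state, or None if move(block, loc) is not executable."""
    if block == loc or occupied(state, block) or occupied(state, loc):
        return None
    successor = dict(state)
    successor[block] = loc      # on(B,L) is caused; the old location is dropped
    return successor


def run_plan(state, plan):
    """Apply the moves in order; None signals a non-executable step."""
    for block, loc in plan:
        state = apply_move(state, block, loc)
        if state is None:
            return None
    return state


initial = {"a": TABLE, "b": TABLE, "c": "a"}   # on(a,table). on(b,table). on(c,a).
goal = {"c": "b", "b": "a", "a": TABLE}
plan = [("c", TABLE), ("b", "a"), ("c", "b")]
final = run_plan(initial, plan)
print(final == goal)
```

Trying to start with move(b,a) instead fails in this simulator because a is still occupied by c, which is the essence of the Sussman Anomaly.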

Dealing with incomplete knowledge. To show the advanced capabilities of
DLV^K, we will now extend our example to deal with partial knowledge. Here we
have to verify that every block: (i) is on top of a unique location, (ii) does not
have more than one block on top of it, and (iii) is supported by the table (i.e.,
it is either on the table or on a stack of blocks which is on the table) [6]. To this
end, we add a new fluent declaration

    supported(B) requires block(B).

and the following rules in the initially section:

    caused false if on(B,L), on(B,L1), L <> L1.
    caused false if on(B1,B), on(B2,B), block(B), B1 <> B2.
    caused supported(B) if on(B,table).
    caused supported(B) if on(B,B1), supported(B1).
    caused false if not supported(B).

Note  that,  under  noConcurrency,  the  action  move  preserves  the  properties  (i), 
(ii),  (iii)  above;  thus,  we  do  not  need  to  check  these  properties  in  all  states,  if 
concurrent  actions  are  forbidden. 

A further block. Suppose now that we have another block d in Figure 1. The
exact location of d is unknown, but we know that it is not on top of c.

We are interested in a plan that works on every possible initial state (i.e.,
no matter if on(d,b) or on(d,table) holds), and reaches the goal on(a,c),
on(c,d), on(d,b), on(b,table) in four steps. We modify the domain description
by adding (i) -on(d,c) and total on(X,Y) in the initially section, and
(ii) the command securePlan. In K, we can “totalize” the knowledge of a fluent
f by declaring total f. which means that, unless a truth value for f is
derivable, the cases where f resp. -f is true will both be considered. By securePlan.
we ask the system to compute only secure plans (alias conformant plans in the
literature). Informally, a plan is secure if it works on every legal initial state,
i.e., never gets stuck by nonexecutable actions or a non-existing next state, and
always enforces the goal.

The execution of this program on DLV^K computes the following result:

    PLAN: move(d,table,0), move(d,b,1), move(c,d,2), move(a,c,3)

The plan clearly works on the two legal initial states, and always leads to the
desired result. Thus, the plan is secure. On the other hand, the 2-step plan
move(c,d,0), move(a,c,1) is not secure. It works for the initial state in which
d is on b, but if d is on the table, the goal state is not reached after its execution.
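
The difference between the two plans can be checked by running each of them from both legal initial states. The sketch below is a simplified reading of security, assuming deterministic moves and the two initial states named in the text (d on b, or d on the table); it does not reproduce how DLV^K verifies secure plans. The state representation and function names are assumptions.

```python
TABLE = "table"


def step(state, block, loc):
    """Successor state after move(block, loc), or None if not executable."""
    blocked = {l for l in state.values() if l != TABLE}    # occupied blocks
    if block == loc or block in blocked or loc in blocked:
        return None
    return {**state, block: loc}


def is_secure(plan, initial_states, goal):
    """True iff the plan executes and reaches the goal from every initial state."""
    for state in initial_states:
        for block, loc in plan:
            state = step(state, block, loc)
            if state is None:
                return False
        if any(state[b] != l for b, l in goal.items()):
            return False
    return True


known = {"a": TABLE, "b": TABLE, "c": "a"}
# d is not on c, and a already carries c, so d is either on b or on the table.
initial_states = [{**known, "d": "b"}, {**known, "d": TABLE}]
goal = {"a": "c", "c": "d", "d": "b", "b": TABLE}

four_step = [("d", TABLE), ("d", "b"), ("c", "d"), ("a", "c")]
two_step = [("c", "d"), ("a", "c")]
print(is_secure(four_step, initial_states, goal))
print(is_secure(two_step, initial_states, goal))
```

The four-step plan first forces d into a known position, which is why it succeeds regardless of where d started.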

Further examples and detailed information on the planning system can be
found on our website (<URL:http://www.dbai.tuwien.ac.at/proj/dlv/>).

1  Introduction

The  Smodels  system  is  an  Answer  Set  Programming  (ASP)  implementation 
based  on  the  stable  model  semantics  of  normal  logic  programs.  The  basic  idea 
of  ASP  is  to  encode  the  constraints  of  a  problem  as  a  logic  program  such  that 
the  answer  sets  (stable  models)  of  the  program  correspond  to  the  solutions  of 
the problem. Then we can solve the problem by letting a logic program engine
find the answer sets of the program.

The Smodels system provides such an engine for computing answer sets. It
extends the class of normal logic programs with cardinality and weight
constraints as well as arithmetic built-in functions. Additionally, function symbols
are also supported. However, the input programs have to be domain-restricted
in a sense that will be explained in Section 2. The Smodels system is available
for download at http://www.tcs.hut.fi/Software/smodels.

For  a  practical  Smodels  example,  consider  the  following  logical  puzzle:  On  the 
Island  of  Knights  and  Knaves  there  are  two  types  of  persons:  knights,  who  always 
tell  the  truth,  and  knaves,  who  always  lie.  A  visiting  logician  met  two  natives,  A 
and  B.  When  asked  about  their  types,  A  answered:  “We  are  both  knaves.”  What 
are  their  types?  We  can  solve  the  puzzle  with  the  following  Smodels  program: 

    person(a; b). type(knight; knave).
    1 { is_type(P, T) : type(T) } 1 :- person(P).
    true_statement :- is_type(a, knave), is_type(b, knave).
    :- is_type(a, knight), not true_statement.
    :- is_type(a, knave), true_statement.

Here the first line defines facts for the persons and types, and the second line
assigns a unique type to each person. The third line states that A's answer is true
if both of them are knaves, and the last two rules enforce the condition that
knights never lie and knaves never tell the truth. The constructs of the form
L { l_1, ..., l_n } U are cardinality constraints that are satisfied whenever the
number of satisfied literals l_i is between the integral bounds L and U, inclusive. A
cardinality constraint in a rule head imposes a non-deterministic choice over the
literals in it when the rule body is satisfied. The construct a(X) : b(X) denotes
the set of atoms a(X) for which b(X) also holds. For example, is_type(P,T) :
type(T) denotes the set {is_type(P, knight), is_type(P, knave)}.

*  This work has been funded by Academy of Finland (project no. 43963). The work
of the first author has been supported by HeCSE and Tekniikan edistämissäätiö.

The Smodels system itself consists of two parts: smodels, which is the actual
inference engine, and lparse, a front-end that instantiates and simplifies user
programs. If we have stored the above program in a file puzzle.lp, we can find
its answer sets by invoking Smodels as follows:

    % lparse -d none puzzle.lp | smodels 0
    smodels version 2.26. Reading...done
    Answer: 1
    Stable Model: is_type(a,knave) is_type(b,knight)
    False

We see that in the unique solution A is a knave and B is a knight. The lparse
argument ‘-d none’ discards the domain predicates person and type from the
output, and the argument ‘0’ asks smodels to compute all stable models.
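
The same puzzle can be solved by direct enumeration, which makes it easy to confirm the answer reported by smodels. The Python sketch below is an independent brute-force check, not a rendering of how lparse or smodels work internally; the function name is an illustrative choice.

```python
from itertools import product

TYPES = ("knight", "knave")


def solutions():
    """Enumerate type assignments consistent with A saying 'We are both knaves.'"""
    found = []
    for a_type, b_type in product(TYPES, repeat=2):
        statement_true = a_type == "knave" and b_type == "knave"
        # A knight's statement must be true; a knave's statement must be false.
        if (a_type == "knight") == statement_true:
            found.append({"a": a_type, "b": b_type})
    return found


print(solutions())
```

The condition in the loop corresponds to the two integrity constraints of the Smodels program: it rejects a knight whose statement is false and a knave whose statement is true.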

2  Theoretical  Background 

The Smodels system has a declarative formal semantics that extends the stable
model semantics of normal logic programs with cardinality and weight
constraints. Intuitively, a set of atoms is a stable model of a program if it
satisfies all rules of the program and each atom in the model occurs as a head of
a rule with a satisfied body. A program may have none, one, or many stable
models. The details of the semantics are explained in Niemelä and Simons [8].

The predicate symbols of a Smodels program P are automatically divided
into two classes, domain and non-domain predicates. The domain predicates of P
are  the  predicates  that  are  not  defined  in  terms  of  negative  recursion  or  using 
choice  rules,  and  they  form  a  stratified  hierarchy  where  complex  domains  are 
defined  in  terms  of  simple  ones.  The  intuition  of  domain  predicates  is  that  they 
are  used  to  define  the  set  of  terms  over  which  the  variables  range  in  each  rule  of 
the  program.  All  rules  in  a  program  have  to  be  domain-restricted  in  the  sense 
that  each  variable  in  a  rule  has  to  occur  also  in  a  positive  domain  predicate 
in  the  rule  body.  In  a  rule  defining  a  domain  predicate  this  domain  literal  has 
to  belong  to  a  strictly  lower  stratum  than  the  head  of  the  rule.  This  syntactic 
restriction  is  strong  enough  to  guarantee  that  the  problem  of  finding  an  answer 
set  stays  decidable  even  when  function  symbols  are  allowed  since  it  ensures  that 
only  a  finite  number  of  ground  atoms  can  be  derived  using  a  single  rule  [11]. 
Consider  the  following  program: 

    number(0..n). even(0).
    even(X+1) :- number(X), odd(X).
    odd(X+1) :- number(X), even(X).
    interesting(X) :- odd(X), not dull(X).
    dull(X) :- odd(X), not interesting(X).

The predicate number is trivially a domain predicate since it does not depend
on any other predicates. Here n is a numeric constant that can be defined from
the command line. Predicates even and odd are also domain predicates since
they depend only on each other and on number, which gives the domain for X in
the two recursive rules. If we left the atom number(X) out of the rule bodies,
the stable models of the program would be infinite. The predicates interesting
and dull are not domain predicates since they depend negatively on each other.
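
To see why the number(X) atoms keep the instantiation finite, the extensions of the domain predicates even and odd can be computed as a least fixpoint. The Python sketch below only illustrates the effect of domain restriction for this program; it is not lparse's grounding algorithm, and the function name and the choice n = 4 are assumptions.

```python
def domain_extensions(n):
    """Least fixpoint of the even/odd rules with number(0..n) as the domain."""
    number = set(range(n + 1))
    even, odd = {0}, set()
    changed = True
    while changed:
        changed = False
        for x in number:                      # X is bound by number(X)
            if x in odd and x + 1 not in even:
                even.add(x + 1)               # even(X+1) :- number(X), odd(X).
                changed = True
            if x in even and x + 1 not in odd:
                odd.add(x + 1)                # odd(X+1) :- number(X), even(X).
                changed = True
    return even, odd


even, odd = domain_extensions(4)
print(sorted(even), sorted(odd))
```

Because X never leaves the range 0..n, the derivation stops at n + 1; without that bound the loop over X would have no finite domain to range over.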

The computational complexity of determining whether there exists an answer
set of a propositional Smodels program is NP-complete. For function-free
programs with variables, it is EXP-complete and when function symbols are used,
it is 2-EXP-complete [11]. Thus, the high expressive power has a significant
computational cost.

3  Implementation 

The Smodels system is composed of two independent components, smodels and
lparse, that are both implemented in C++. The smodels component finds answer
sets of variable-free primitive logic programs using a Davis-Putnam like
backtracking search procedure [9]. It also uses inherent properties of the stable
model semantics to infer and propagate truth values to prune the search space. The
lparse component is a front-end that translates user programs into the smodels
internal format by instantiating non-ground rules and simplifying complex
constructs. It first creates the dependency graph of the program and identifies the
domain predicates by computing strongly connected components of the graph.
Next, the extensions of the domain predicates are computed and they are used
to instantiate the rest of the rules.


4  Applications 

The  Smodels  system  is  actively  used  in  dozens  of  research  groups  all  over  the 
world.  Next  we  briefly  review  some  interesting  application  areas  already  studied. 

Planning  is  a  potential  application  area  for  ASP  and  Smodels  has  been  used 
in  a  number  of  approaches,  see  e.g.  [4,10].  An  interesting  example  of  an  advanced 
project  is  research  at  Texas  Tech  University  on  developing  a  decision  support 
system  for  the  flight  controllers  of  space  shuttles  using  the  ASP  approach  [3]. 

General methodology for product configuration using ASP techniques has
been developed and interesting application projects have been started, see
e.g. the WeCoTin project (http://www.soberit.hut.fi/WeCoTin/). Research
on software configuration using ASP has led to a prototype configurator for the
Debian Linux system distribution
(http://www.tcs.hut.fi/~tssyrjan/configuration/).

Smodels has been used for a variety of key inference tasks in computer
aided verification. An analysis method for Petri nets based on finite complete
prefixes has been built on top of Smodels, see the mcsmodels and unfsmodels
tools (http://www.tcs.hut.fi/~kepa/tools/). Similarly, it has been used in
a stubborn set reduction method for Petri net reachability analysis, see the prod
tool (http://www.tcs.hut.fi/Software/prod/). Recently, a symbolic model
checking method for asynchronous systems based on bounded model checking
techniques and ASP has been developed [5].

There are a number of other interesting areas where Smodels has been
employed, including logical cryptanalysis [6], security protocol analysis [1],
network inhibition analysis [2], and computation of (total and partial) stable
models for disjunctive programs [7].

Abstract. In this work we show how control knowledge was used to
improve planning in the USA-Advisor decision support system for the
Space Shuttle. The USA-Advisor is a medium size, real-world planning
application for use by NASA flight controllers and contains over a dozen
domain dependent and domain independent heuristics. Experimental
results are presented here, illustrating how this control knowledge helps
improve both the quality of plans as well as overall system performance.


1  Introduction 

This paper is a report on the development of a medium size, real-world
application, the USA-Advisor¹, a decision support system for the Space Shuttle flight
controllers. While the methods used in this work are general enough to model
any of the subsystems of the shuttle, our initial prototype models the Reaction
Control System (RCS) [1]. This system maneuvers the Space Shuttle while it
is in orbit. In order for the Space Shuttle to perform a given maneuver, a set
of jets, belonging to one or more of the three RCS's subsystems, and pointing
in the correct directions, must be prepared to fire. Preparing a jet to fire
involves providing an open, non-leaking path for the fuel to flow from fuel tanks
to the jet. Fuel flow is controlled by opening and closing valves through either
having an astronaut flip a switch or by instructing the computer to issue special
commands. Switches are connected to valves through fairly complex electrical
circuits.

Plans can be created in advance for simple, single-failure situations, but it is
impossible to create plans in advance for every possible combination of failures.
The USA-Advisor was designed to help verify and generate plans for operation in
such situations. More details on the design of the system can be found in [2].

¹ The USA-Advisor was created with the support of United Space Alliance under
Research Grant 26-3502-21 and Contract COC6771311. The authors thank Matt Barry
of the USA Advanced Technology Development Group for his technical support.


T. Eiter, W. Faber, and M. Truszczyński (Eds.): LPNMR 2001, LNAI 2173, pp. 439-442, 2001.
© Springer-Verlag Berlin Heidelberg 2001
2  The USA-Advisor

The USA-Advisor consists of a collection of largely independent modules and a
graphical Java interface. The interface gives the user a simple way to enter
information about the history of the RCS, its faults, and the task to be performed.
There are two possible types of tasks: checking whether a sequence of occurrences
of actions satisfies a goal, G, and finding a plan for G of a length not exceeding
some number of steps, N. Based on this information, the interface verifies that
the input is complete, selects an appropriate combination of modules, assembles
them into an A-Prolog² program, Π, and passes Π as input to Smodels³, a
reasoning system for computing stable models. In this approach, the task of
checking a plan P is reduced to checking whether there exists a stable model of
the program Π ∪ P. A planning module is used to generate a set of possible plans,
and a correctness theorem guarantees that there is a one-to-one correspondence
between the plans and the stable models of the program. Planning is thus
reduced to finding such models. Finally, the Java interface extracts the appropriate
answer from the Smodels output and displays it in a user-friendly format.
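
The reduction of plan checking to stable model existence can be illustrated on a very small scale. The following Python sketch is not the USA-Advisor code and does not use Smodels: it enumerates the stable models of a tiny ground program by brute force, using the Gelfond-Lifschitz reduct. The rule encoding, the atom names (flip0, open1, closed1, goal), and the toy program PI are all assumptions made for illustration.

```python
from itertools import chain, combinations

def rule(head, pos=(), neg=()):
    """Ground rule 'head :- pos, not neg'; head=None encodes a constraint."""
    return (head, frozenset(pos), frozenset(neg))

def least_model(definite_rules):
    """Least model of negation-free rules, computed by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model

def is_stable(rules, candidate):
    """Candidate is stable iff it is the least model of its own reduct."""
    reduct = []
    for head, pos, neg in rules:
        if neg & candidate:
            continue                  # rule is blocked by the candidate
        if head is None:
            if pos <= candidate:
                return False          # constraint body holds: rejected
            continue
        reduct.append((head, pos))
    return least_model(reduct) == candidate

def stable_models(rules):
    """Brute force over all subsets of atoms; only viable for toy programs."""
    atoms = sorted({a for h, pos, neg in rules
                    for a in ([h] if h else []) + list(pos) + list(neg)})
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_stable(rules, set(s))]

def plan_is_valid(program, plan_facts):
    """A plan P passes the check iff program + P has a stable model."""
    return bool(stable_models(program + plan_facts))

# Hypothetical toy domain: flipping a switch at step 0 opens a valve at step 1.
PI = [
    rule("open1", pos=["flip0"]),
    rule("closed1", neg=["open1"]),   # default: closed unless known open
    rule("goal", pos=["open1"]),
    rule(None, neg=["goal"]),         # constraint: the goal must hold
]

print(plan_is_valid(PI, [rule("flip0")]))  # plan that flips the switch
print(plan_is_valid(PI, []))               # empty plan
```

In this toy setting the plan consisting of the single fact flip0 is accepted, because the combined program has a stable model, while the empty plan is rejected by the constraint. A real solver avoids the exponential enumeration used here.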

3  The  Planners 

To investigate the effect of heuristics on the efficiency of the USA-Advisor, we
ran experiments using two different planning modules. The Basic Planner, a
planning module that contains no control information, follows the
generate-and-test approach from [3,4]. Since the RCS contains more than
200 actions with rather complex effects, and may require long plans, this
standard approach needed to be improved. This was done by adding various forms
of heuristic, domain-dependent information. We refer to the Basic Planner
expanded by such heuristics as the Smart Planner. The modular design of the
USA-Advisor allows for the creation of a variety of such modules. Coding the
control knowledge in A-Prolog is straightforward and does not require any
additional language features. We do not present examples of such heuristics
here because of space limitations.

One interesting characteristic of the RCS domain is that the goal can be
decomposed into independent subgoals, which can be solved in parallel. Parallel
subgoals can easily be coded in A-Prolog and were used in both planners.
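
The difference between a planner without control information and one with it can be sketched on a toy domain. The Python code below is only an illustration of generate and test with parallel actions; it is not written in A-Prolog, and the three-valve domain, the function names, and the control rule `useful` are hypothetical (the heuristics of the Smart Planner are not given in this paper).

```python
from itertools import combinations, product

VALVES = ("v1", "v2", "v3")
ACTIONS = [(op, v) for op in ("open", "close") for v in VALVES]

def step_choices():
    """All sets of actions that may run in parallel (each valve at most once)."""
    choices = []
    for k in range(len(VALVES) + 1):
        for acts in combinations(ACTIONS, k):
            if len({v for _, v in acts}) == k:
                choices.append(acts)
    return choices

def execute(state, plan):
    """Apply each step's actions to the set of currently open valves."""
    state = set(state)
    for acts in plan:
        for op, v in acts:
            (state.add if op == "open" else state.discard)(v)
    return state

def useful(acts, goal):
    """Hypothetical control rule: never idle, only open valves the goal needs."""
    return bool(acts) and all(op == "open" and v in goal for op, v in acts)

def generate_and_test(initial, goal, length, control=None):
    """Generate every candidate plan of `length` steps and test each one.

    Returns the plans that reach the goal and the number of candidates tested.
    """
    choices = [c for c in step_choices() if control is None or control(c, goal)]
    found, tested = [], 0
    for candidate in product(choices, repeat=length):
        tested += 1
        if goal <= execute(initial, candidate):
            found.append(candidate)
    return found, tested

basic_plans, basic_tested = generate_and_test(set(), {"v1", "v2"}, 1)
smart_plans, smart_tested = generate_and_test(set(), {"v1", "v2"}, 1, useful)
print(basic_tested, len(basic_plans))
print(smart_tested, len(smart_plans))
```

The control rule shrinks the set of candidates and also discards plans that contain actions irrelevant to the goal, which is one way such knowledge can affect plan quality as well as search time.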

4  Experiments 

In this section we give an overview of our experiments with the two planners
used by the USA-Advisor. We used a 933 MHz Pentium III computer with 128
MB of RAM, running the NetBSD 1.5 operating system; Smodels version 2.26,
with input from Lparse version 1.0.2, was used to find the plans.


² The language of logic programs under the answer set semantics.
³ http://www.tcs.hut.fi/Software/smodels
By a test instance we mean a collection of system faults together with a
maneuver to be performed by the shuttle. There are two possible types of faults:
mechanical faults, which render valves, switches, or jets non-functional, and
electrical faults, which affect electrical circuits by having a value of 0 or 1
permanently present on the input or output wire(s) of a component.
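
An electrical fault of this kind is commonly called a stuck-at fault. As a minimal sketch, assuming for illustration that the faulty component is a two-input AND gate (the paper does not say which components the circuits contain), the effect can be modeled in Python as follows; all names are hypothetical.

```python
def wire(value, stuck_at=None):
    """Value seen on a wire; stuck_at is None (healthy), 0, or 1."""
    return value if stuck_at is None else stuck_at

def and_gate(a, b, in_faults=(None, None), out_fault=None):
    """Two-input AND gate with optional stuck-at faults on its wires."""
    a = wire(a, in_faults[0])
    b = wire(b, in_faults[1])
    return wire(a & b, out_fault)

print(and_gate(1, 1))                       # healthy gate
print(and_gate(1, 1, out_fault=0))          # output wire stuck at 0
print(and_gate(0, 1, in_faults=(1, None)))  # first input wire stuck at 1
```

A fault on an output wire hides the inputs completely, while a fault on an input wire only replaces one signal, so the same switch position can have different effects depending on where the fault sits.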

In the first series of experiments, we randomly generated a collection of
test instances with a given number of mechanical and electrical faults, and ran
the Basic and the Smart Planners in a loop with lasttime ranging from 3 to 10.
The duration of each iteration of the loop was limited to 10 minutes.
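
The experimental loop can be sketched as follows. This is an assumption-laden illustration rather than the script actually used: `solve` is a hypothetical callable that runs a planner with the given value of lasttime and a time budget and returns a plan or None, and the sketch assumes that the loop stops at the first value of lasttime for which a plan is found, which the text does not state explicitly.

```python
def shortest_plan(solve, min_steps=3, max_steps=10, limit_seconds=600.0):
    """Try increasing plan lengths; return (lasttime, plan) or None.

    `solve(lasttime, limit_seconds)` is a hypothetical planner call that
    returns a plan, or None if no plan was found within the time budget.
    """
    for lasttime in range(min_steps, max_steps + 1):
        plan = solve(lasttime, limit_seconds)
        if plan is not None:
            return lasttime, plan
    return None

def stub_solver(lasttime, limit_seconds):
    """Stand-in planner: pretends that plans exist from 5 steps onward."""
    return ["step"] * lasttime if lasttime >= 5 else None

print(shortest_plan(stub_solver))
```

Because the lengths are tried in increasing order, the first plan returned under this assumption is minimal in the number of steps among the lengths tried.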

Overall, about 500 test instances were generated in this manner. Here we
discuss the performance of both planners for 60 instances containing three
mechanical and two electrical faults (the most interesting situation from the
standpoint of the USA experts). In all 60 cases, the Smart Planner was able to
find the plans or discover their absence in less than 22 seconds. The Basic
Planner required substantially more time. In some cases the difference exceeded
two orders of magnitude. On average, the Smart Planner was about 10 times faster.
The plans generated by both planners did not exceed 15 actions performed in 5
steps. Other random experiments, run on tests with between 3 and 8 faults,
did not produce any new insights. The plans produced by the Smart Planner
were of good quality: they were minimal in the number of steps and
satisfied the requirements of the USA experts. The Basic Planner did substantially
worse, finding only one plan of good quality.

The second series of experiments was a deliberate attempt to crash
the system. We selected a number of test instances which correspond to especially
difficult situations. Even though the size of the grounded programs (up to 156,500
rules), the length of plans (7 to 8 time steps), and the number of actions involved
(up to 24) were substantially larger than in the initial experiments, the running
time of the Smart Planner was still quite acceptable: each test run took less than
90 seconds, while the USA's requirement set the limit at 15 minutes. In contrast,
the Basic Planner was not able to find solutions to any of these problems; in each
instance we stopped the planner after 24 hours of CPU time. It is interesting to
note that all of the Smart Planner heuristics were needed to achieve this
performance. Even though removal of some of them gave us small improvements on
a few test instances, on others the performance worsened by more than an order
of magnitude.

5  Conclusion 

In this paper we described experiments with two planners used by a medium-size
decision support system⁴ written in A-Prolog. The domain of the planners
and their construction can be of interest to the reader from several different
standpoints.


⁴ The code for the USA-Advisor and details on experiment results are available on
request from the authors.
•  Since a single action of an astronaut changes the values of many inter-related
fluents of the RCS, describing the effects of this action becomes a nontrivial
task. We solved this problem by using the techniques developed in the
theory of actions and change and the power of A-Prolog rules. It is not clear
to us how these effects could be accurately represented by more traditional
STRIPS-like action languages.

•  A-Prolog proved to be a language capable of specifying the initial situation,
causal and other relations of the domain, as well as the heuristic information
limiting the search space and improving the quality of plans. This contrasts
with some of the other representational approaches, which require separate
languages for each of these classes of statements.

•  Answer set planning proved to be a good tool for our purpose. Partly this
is due to the non-numerical nature of the problem. But the planner's ability to
mix parallel and sequential plans and to search for them efficiently is the
key ingredient in the success of the project.

•  The heuristics used in the Smart Planner were easy to encode and to use.
Our experiments show that they significantly improve both the quality of plans
and the efficiency of search.

•  It was interesting to notice that many fluents of the RCS domain had
natural recursive definitions, easily expressible in A-Prolog. This simplified the
representation but precluded the immediate use of CCALC-style planning
with satisfiability solvers. It will be interesting to see if such solvers could
be used after some modifications of the representation. It is probably also
worth mentioning that the non-monotonicity of A-Prolog played an important
role in the formalization of the domain, e.g., in specifying the inertia axiom,
closed world assumptions used for describing the initial situation, and other
typical default knowledge.

•  The complexity of representing indirect effects of actions and the seemingly
essential presence of recursion make the question of building a version of the
USA-Advisor not based on A-Prolog an interesting (possibly nontrivial) open one.
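
The remark about recursively defined fluents can be made concrete with a small sketch. Assume, purely for illustration, a fluent "pressurized" defined recursively: a tank is pressurized, and a node is pressurized if an open valve connects it to a pressurized node. The Python code below computes the least fixpoint of such a definition; the network, the valve names, and the function are hypothetical and are not taken from the RCS model.

```python
def pressurized(tanks, pipes, open_valves):
    """Least fixpoint of a recursive reachability-style definition.

    pipes is a list of (source, valve, destination) triples; a destination
    becomes pressurized when its source is and the valve between them is open.
    """
    reached = set(tanks)
    changed = True
    while changed:
        changed = False
        for src, valve, dst in pipes:
            if src in reached and valve in open_valves and dst not in reached:
                reached.add(dst)
                changed = True
    return reached

# Hypothetical network with a loop between n2 and n3.
PIPES = [
    ("tank", "v1", "n1"),
    ("n1", "v4", "n2"),
    ("n2", "v2", "n3"),
    ("n3", "v3", "n2"),
]

print(sorted(pressurized({"tank"}, PIPES, {"v2", "v3"})))
print(sorted(pressurized({"tank"}, PIPES, {"v1", "v2", "v3", "v4"})))
```

In the first call the loop between n2 and n3 is open but cut off from the tank, and the least fixpoint correctly leaves both nodes unpressurized. A reading of the same rules as plain equivalences would also admit an unintended interpretation in which n2 and n3 justify each other, which is one way to see why recursive definitions are not directly usable with satisfiability solvers.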